Abstract — One approach to designing structured low-density 
parity-check (LDPC) codes with large girth is to shorten codes 
with small girth in such a manner that the deleted columns of the 
parity-check matrix contain all the variables involved in short 
cycles. This approach is especially effective if the parity-check 
matrix of a code is a matrix composed of blocks of circulant 
permutation matrices, as is the case for the class of codes known 
as array codes. We show how to shorten array codes by deleting 
certain columns of their parity-check matrices so as to increase 
their girth. The shortening approach is based on the observation 
that for array codes, and in fact for a slightly more general class 
of LDPC codes, the cycles in the corresponding Tanner graph are 
governed by certain homogeneous linear equations with integer 
coefficients. Consequently, we can selectively eliminate cycles 
from an array code by only retaining those columns from the 
parity-check matrix of the original code that are indexed by 
integer sequences that do not contain solutions to the equations 
governing those cycles. We provide Ramsey-theoretic estimates 
for the maximum number of columns that can be retained 
from the original parity-check matrix with the property that 
the sequence of their indices avoids solutions to various types 
of cycle-governing equations. This translates to estimates of the 
rate penalty incurred in shortening a code to eliminate cycles. 
Simulation results show that for the codes considered, shortening 
them to increase the girth can lead to significant gains in signal- 
to-noise ratio in the case of communication over an additive white 
Gaussian noise channel. 

Index Terms — Array codes, LDPC codes, shortening, cycle- 
governing equations 

I. Introduction 

Despite their excellent error-correcting properties, low- 
density parity-check (LDPC) codes with random-like structure 
[8], [17, pp. 556-572] have several shortcomings. The most 
important of these is the lack of mathematical structure in 
the parity-check matrices of such codes, which leads to 
increased encoding complexity and prohibitively large storage 
requirements. These issues can usually be resolved by using 
structured LDPC codes, but at the cost of some performance 
loss. This performance loss may be attributed to the fact that 
algebraic code design techniques introduce various constraints 
on the set of code parameters influencing the performance of 
belief propagation decoding, so that it is hard to optimize the 
overall structure of the code. 

One parameter that is usually targeted for optimization in 
the process of designing structured LDPC codes is the girth 
of the underlying Tanner graph. Several classes of structured 
LDPC codes with moderate and large values of girth and good 
performance under iterative decoding are known, examples of 
which can be found in [10], [12]-[14], [19], [22], [26], [29]. 
In this paper, we focus our attention on a class of LDPC 
codes termed array codes [5] (or equivalently, lattice codes 
[29]). These codes are quasi-cyclic, and have parity-check 
matrices that are composed of circulant permutation matrices. 
General forms of such parity-check matrices were investigated 
in [6] and [27], and codes of girth eight, ten and twelve were 
obtained primarily through extensive computer search. 

Fossorier [6] considered a family of quasi-cyclic LDPC 
codes closely related to array codes, and derived simple 
necessary and sufficient conditions for such codes to have girth 
larger than six or eight. Subsequently, codes with large girth 
were constructed with the aid of computer search strategies 
which rely on randomly generating integers until the condi- 
tions of the theorem are met. 

We generalize and extend the array code design methods in 
a slightly different direction, and provide a less computation- 
intensive approach to constructing codes with large girth 
(including values exceeding eight). Our approach is based 
on the observation that the existence of cycles in the Tanner 
graph of an array code is governed by certain homogeneous 
linear equations. We show that it is possible to exhaustively 
list all the equations governing cycles of length six, eight 
and ten in an array code having a parity-check matrix with 
a small number of ones in each column. Thus, by shortening 
an array code in such a way as to only retain those columns 
of its parity-check matrix whose indices form a sequence 
that avoids solutions to some of these "cycle-governing" 
equations, one can obtain array codes with a pre-specified 
distribution of cycles of various lengths. This provides a means 
of studying the effects of different classes of cycles on array 
code performance. In particular, this technique can be used to 
entirely eliminate cycles of short lengths, resulting in codes 
of girth up to twelve. One special form of an array code of 
girth eight and column-weight three was first described in [29] 
and [30], where a good choice for the set of columns to be 
retained from the original parity-check matrix was determined 
using geometrical arguments. 

Using techniques from graph theory and Ramsey theory, we 
provide analytical estimates of the designed code rates achiev- 
able by shortening an array code to improve girth, and present 
some useful algorithms for identifying large sets of column- 
indices that avoid solutions to cycle-governing equations. 
Simulation results show that eliminating short cycles using 
this technique leads to significant signal-to-noise ratio (SNR) 
gains over the additive white Gaussian noise (AWGN) channel. 
These codes also compare favorably with other classes of 
structured LDPC codes in the literature, and in fact show 
marked improvement in performance in some cases. 
The remainder of the paper is organized as follows. Sec- 
tion 2 describes a generalization of the array code construction 
and provides some definitions needed for the subsequent 
exposition. In Section 3, we explicitly show how cycles in 
the Tanner graphs of these codes are governed by certain 
homogeneous linear equations with integer coefficients. We 
then go on to list the equations governing cycles of length 
six, eight and ten in array codes with parity-check matrices of 
small column-weight. Section 4 contains bounds on the size 
of the maximal sequence of column indices that contains no 
solutions to certain homogeneous linear equations. A greedy 
algorithm for constructing such sequences, as well as some 
simple extensions thereof, are discussed in Section 5. Simu- 
lation results are given in Section 6, with some concluding 
remarks presented in Section 7. The proofs of some of the 
results of Section 4 are provided in the Appendix. 

II. Array Codes 

Array codes [5] are structured LDPC codes with good 
performance under iterative message-passing decoding. Their 
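parity-check matrix has the form 

$$
H_{\mathrm{arr}} =
\begin{bmatrix}
I & I & I & \cdots & I \\
I & P & P^{2} & \cdots & P^{q-1} \\
I & P^{2} & P^{4} & \cdots & P^{2(q-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
I & P^{r-1} & P^{2(r-1)} & \cdots & P^{(r-1)(q-1)}
\end{bmatrix},
\tag{1}
$$

where $q$ is an odd prime, $r$ is an integer$^1$ in $[1, q]$, $I$ is the $q \times q$ 
identity matrix, and $P$ is a $q \times q$ circulant permutation matrix 
distinct from $I$. Recall that a permutation matrix is a square 
matrix composed of 0's and 1's, with a single 1 in each row 
and column. A circulant permutation matrix is a permutation 
matrix that is also circulant, i.e., the $i$th row of the matrix can 
be obtained by cyclically shifting the $(i-1)$th row by one 
position to the right. Typically, the matrix $P$ in (1) is chosen 
to be the matrix 

$$
P =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & \cdots & 0
\end{bmatrix}.
$$
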
An LDPC code described by such a parity-check matrix is 
regular, with length $q^2$ and co-dimension at most $rq$. The row and 
column weights of such a code are $q$ and $r$, respectively. 
Consequently, the rate $R$ of such codes is at least $1 - r/q$. 

We will consider the following more general form for a 
parity-check matrix: 

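$$
H =
\begin{bmatrix}
P^{a_0 \cdot 0} & P^{a_0 \cdot 1} & \cdots & P^{a_0 (q-1)} \\
P^{a_1 \cdot 0} & P^{a_1 \cdot 1} & \cdots & P^{a_1 (q-1)} \\
\vdots & \vdots & & \vdots \\
P^{a_{r-1} \cdot 0} & P^{a_{r-1} \cdot 1} & \cdots & P^{a_{r-1} (q-1)}
\end{bmatrix},
\tag{2}
$$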



where $a_0, a_1, \ldots, a_{r-1}$ is some sequence of $r$ distinct integers 
from $[0, q-1]$. Each such parity-check matrix defines a code. If 
the sequence $a_0, a_1, \ldots, a_{r-1}$ forms an arithmetic progression 
(A. P.), i.e., if there exists an integer $a \neq 0$ such that $a_{i+1} - a_i = a$ 
for $i = 0, 1, 2, \ldots, r-2$, then we call the corresponding 
code a proper array code (PAC). Note that if $a_0 = 0$, then the 
PAC is simply an array code with parity-check matrix $H_{\mathrm{arr}}$ 
as in (1), since the parity-check matrix in (2) has the same 
form as $H_{\mathrm{arr}}$, as can be seen by replacing $P$ in $H_{\mathrm{arr}}$ by $P^a$. If 
the sequence $a_0, a_1, \ldots, a_{r-1}$ does not form an A. P., then the 
corresponding code will be referred to as an improper array 
code (IAC). The term array code without further qualification 
will henceforth be used to mean an IAC or a PAC. 

$^1$In this paper, we will use the notation $[a, b]$ to denote the set 
$\{x \in \mathbb{Z} : a \le x \le b\}$. 

Throughout the remainder of the paper, we will use the 
following definitions/terminology: 

• The odd prime q used in defining the parity-check matrix 
of an array code will be referred to as the modulus of the 
code. 

• A block-column (block-row) of a parity-check matrix, $H$, 
of an array code is the submatrix formed by a column 
(row) of permutation matrices from $H$. The $q$ block- 
columns of $H$ are indexed by the integers from 0 to $q-1$, 
and the $r$ block-rows are indexed by the integers from 
0 to $r-1$. For example, the $j$th block-column of $H$ is the 
matrix $[P^{a_0 j}\ P^{a_1 j}\ P^{a_2 j}\ \cdots\ P^{a_{r-1} j}]^T$. 

• The term block-row labels will be used to denote the 
integers in the sequence $a_0, a_1, \ldots, a_{r-1}$ that define the 
matrix $H$ in (2). 

• A block-column-shortened array code, or simply a short- 
ened array code, is a code whose parity-check matrix is 
obtained by deleting a prescribed set of block-columns 
from the parity-check matrix of an array code. 

• The labels of the block-columns retained in the parity- 
check matrix of the shortened code are simply their 
indices in the parent code. For the parent code itself, 
the terms "label" and "index" for a block-column can be 
used interchangeably. 

• A closed path of length $2k$ in any parity-check matrix 
of the form in (2) is a sequence of block-row and block- 
column index pairs $(i_1, j_1), (i_1, j_2), (i_2, j_2), (i_2, j_3), \ldots,$ 
$(i_k, j_k), (i_k, j_1)$, with $i_t \neq i_{t+1}$, $j_t \neq j_{t+1}$, for $t = 
1, 2, \ldots, k-1$, and $i_k \neq i_1$, $j_k \neq j_1$. 

The significance of closed paths arises from the following sim- 
ple but important result from [5] (see also [6, Theorem 2.1]): 

Theorem 1. A cycle of length $2k$ exists in the Tanner graph of 
an array code with parity-check matrix $H$ and block-row labels 
$a_0, a_1, \ldots, a_{r-1}$ if and only if there exists a closed path $(i_1, j_1)$, 
$(i_1, j_2), (i_2, j_2), (i_2, j_3), \ldots, (i_k, j_k), (i_k, j_1)$ in $H$ such that 

$$
P^{a_{i_1} j_1} \left(P^{a_{i_1} j_2}\right)^{-1} P^{a_{i_2} j_2} \left(P^{a_{i_2} j_3}\right)^{-1} \cdots P^{a_{i_k} j_k} \left(P^{a_{i_k} j_1}\right)^{-1}
$$

evaluates to the identity matrix $I$. 

In fact, since $P$ is a $q \times q$ circulant permutation matrix, $P \neq I$, 
and $q$ is prime, we can have $P^n = I$ if and only if $n \equiv 0 
\pmod q$. So, the condition in the theorem is equivalent to 

$$
a_{i_1}(j_1 - j_2) + a_{i_2}(j_2 - j_3) + \cdots + a_{i_k}(j_k - j_1) \equiv 0 \pmod q,
\tag{3}
$$

which can also be written as 

$$
j_1(a_{i_1} - a_{i_k}) + j_2(a_{i_2} - a_{i_1}) + \cdots + j_k(a_{i_k} - a_{i_{k-1}}) \equiv 0 \pmod q.
\tag{4}
$$

Based on Theorem 1, it is easily seen [5] that array codes 
are free of cycles of length four. This is because a cycle of 
length four exists if and only if there exist indices $i_1, i_2, j_1, j_2$, 
$i_1 \neq i_2$, $j_1 \neq j_2$, such that 

$$
(a_{i_1} - a_{i_2})(j_1 - j_2) \equiv 0 \pmod q,
$$

which is clearly impossible since $i_1 \neq i_2$ and $j_1 \neq j_2$: both 
factors are then nonzero modulo the prime $q$. 

On the other hand, an array code with a parity-check matrix 
of the form in (1), with $q \ge 5$, $r \ge 3$, has cycles of length six. 
An example is the closed path described by the coordinates 
$(1,1), (1,2), (2,2), (2,\tfrac{q+3}{2}), (0,\tfrac{q+3}{2}), (0,1)$, which satisfies 
(3), since $a_i = i$ in this case, and 

$$
1 \cdot (1 - 2) + 2 \cdot \left(2 - \tfrac{q+3}{2}\right) + 0 \cdot \left(\tfrac{q+3}{2} - 1\right) = -q \equiv 0 \pmod q.
$$
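
The closed-path condition is easy to check numerically. The following minimal Python sketch evaluates the left-hand side of (3) for a closed path given by its block-row and block-column indices. The function name `closed_path_is_cycle` and the list-based encoding of a path are illustrative assumptions, not notation from the paper.

```python
def closed_path_is_cycle(rows, cols, labels, q):
    """Return True if the closed path (i1,j1),(i1,j2),(i2,j2),...,(ik,j1)
    satisfies congruence (3), i.e. corresponds to a cycle of length 2k.

    rows   -- block-row indices i1..ik visited by the path
    cols   -- block-column indices j1..jk visited by the path
    labels -- block-row labels a_0..a_{r-1}
    q      -- the (prime) modulus of the code
    """
    k = len(rows)
    total = sum(labels[rows[t]] * (cols[t] - cols[(t + 1) % k]) for t in range(k))
    return total % q == 0


q = 7
labels = [0, 1, 2]  # array code as in (1): a_i = i

# The six-cycle example from the text, with (q + 3) / 2 = 5 for q = 7.
six_cycle = closed_path_is_cycle([1, 2, 0], [1, 2, (q + 3) // 2], labels, q)

# Exhaustive search for closed paths of length four (expected to find none).
four_cycles = [
    (i1, i2, j1, j2)
    for i1 in range(3) for i2 in range(3) if i1 != i2
    for j1 in range(q) for j2 in range(q) if j1 != j2
    if closed_path_is_cycle([i1, i2], [j1, j2], labels, q)
]
```

Here `six_cycle` confirms the example path, and the empty list `four_cycles` reflects the absence of cycles of length four.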

In general, a closed path of length six in the parity-check 
matrix of an array code must pass through three different 
block-rows, indexed by $r_1, r_2, r_3$, and three different block- 
columns, indexed by $i, j, k$. In the case of a PAC, the block-row 
labels $a_0, a_1, \ldots, a_{r-1}$ form an A. P. with common difference 
$a$, $0 < |a| < q$, and hence (4) reduces to 

$$
a \left[ i(r_1 - r_3) + j(r_2 - r_1) + k(r_3 - r_2) \right] \equiv 0 \pmod q.
$$

Thus, a PAC has a cycle of length six if and only if there exist 
distinct block-row indices $r_1, r_2, r_3$ and distinct block-column 
indices $i, j, k$ such that 

$$
i(r_1 - r_3) + j(r_2 - r_1) + k(r_3 - r_2) \equiv 0 \pmod q.
\tag{5}
$$

Therefore, by shortening the PAC so as to only retain block- 
columns with labels such that (5) is never satisfied, we 
eliminate all cycles of length six, obtaining a code of girth 
at least eight. 
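
As a sanity check on this shortening argument, the sketch below builds the Tanner graph of a (shortened) array code directly from the block structure in (2) and measures its girth by breadth-first search. It assumes the typical choice of P (a one-position cyclic shift), so that P^n has its 1 in row x at column (x + n) mod q. The helper names `parity_check_rows` and `girth` are hypothetical, and the brute-force girth routine is only meant for small q.

```python
from collections import deque


def parity_check_rows(q, row_labels, col_labels):
    """Rows of H as sets of column positions; block (a, j) is P^(a*j)."""
    rows = []
    for a in row_labels:
        for x in range(q):
            # In block-column number b (label j), row x of P^(a*j) has its 1
            # at column (x + a*j) mod q.
            rows.append({b * q + (x + a * j) % q
                         for b, j in enumerate(col_labels)})
    return rows


def girth(rows, n_cols):
    """Length of the shortest cycle in the Tanner graph (inf if acyclic)."""
    m = len(rows)
    adj = [[] for _ in range(m + n_cols)]
    for c, row in enumerate(rows):
        for v in row:
            adj[c].append(m + v)      # check node c -- variable node v
            adj[m + v].append(c)
    best = float("inf")
    for start in range(m + n_cols):
        dist, parent = {start: 0}, {start: -1}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:
                    # Non-tree edge closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[w] + 1)
    return best


q = 7
full = parity_check_rows(q, [0, 1, 2], list(range(q)))
short = parity_check_rows(q, [0, 1, 2], [0, 1, 3])  # labels avoiding (5) mod 7
girth_full = girth(full, q * q)
girth_short = girth(short, 3 * q)
```

For q = 7 and block-row labels {0, 1, 2}, the full code should report girth six, while retaining only the block-columns labeled 0, 1, 3 should raise the girth to eight.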

It is naturally of interest to extend this kind of analysis to 
cover the case of cycles of length larger than six, and utilize it 
to appropriately shorten an array code to increase its girth. The 
next section deals with the subject of identifying sequences of 
block-column labels leading to codes with large girth. 

III. Array Codes of Girth Eight, Ten, and Twelve 

For clarity of exposition, in all subsequent derivations we 
will focus only on the two special cases of array codes with 
column weight three and four. The results presented can be 
extended in a straightforward, albeit tedious, manner to codes 
with larger column weights. 

Theorem 2. Let $C$ be a PAC with modulus $q$ whose parity- 
check matrix, $H$, has column weight $r$. If $r = 4$, then $C$ contains 
a cycle of length six if and only if there exist three distinct block- 
columns in $H$ whose labels $i, j, k$ satisfy at least one of the 
following two congruences: 

$$
\begin{aligned}
-2i + j + k &\equiv 0 \pmod q, \\
-3i + j + 2k &\equiv 0 \pmod q.
\end{aligned}
\tag{6}
$$

If $r = 3$, then $C$ contains a cycle of length six if and only if there 
exist three distinct block-columns whose labels $i, j, k$ satisfy the 
first of the two congruences. 

Proof. The claim for $r = 4$ follows immediately from (5) 
once we note that any three block-row indices $r_1, r_2, r_3 \in 
\{0, 1, 2, 3\}$, $r_1 < r_2 < r_3$, must satisfy one of the following: 
(i) $r_1 - r_3 = -2$, $r_3 - r_2 = 1$, (ii) $r_1 - r_3 = -3$, $r_3 - r_2 = 1$, 
or (iii) $r_1 - r_3 = -3$, $r_3 - r_2 = 2$. 

The proof for the $r = 3$ case similarly follows from the 
fact that the only possible choice for the set of three distinct 
block-row indices in this case is $\{0, 1, 2\}$. ■ 

A useful consequence of the above result is Corollary 3 
below, to state which it is convenient to introduce the following 
definition. Here, and in the rest of the paper, the set of positive 
integers is denoted by $\mathbb{Z}^+$, and given an $N \in \mathbb{Z}^+$, the ring of 
integers modulo $N$ is denoted by $\mathbb{Z}_N$. 

Definition 1. A sequence of distinct non-negative integers 
$n_1, n_2, n_3, \ldots$ is defined to be a non-averaging sequence if it 
contains no term that is the average of two others, i.e., $n_i + n_j = 
2n_k$ only if $i = j = k$. Similarly, given an $N \in \mathbb{Z}^+$, a sequence 
of distinct integers $n_1, n_2, n_3, \ldots$ in $[0, N-1]$ is non-averaging 
over $\mathbb{Z}_N$ if $n_i + n_j \equiv 2n_k \pmod N$ implies that $i = j = k$. 

It is clear from the definition that a sequence is non- 
averaging if and only if it contains no non-constant three- 
term A. P. The following result is a simple consequence of 
Theorem 2 and Definition 1. 

Corollary 3. Let $H$ be the parity-check matrix of a PAC with 
modulus $q$, consisting of three block-rows, and let $A$ be the $3q \times 
mq$ matrix obtained by deleting some $q - m$ block-columns 
from $H$. The shortened array code with parity-check matrix $A$ 
has girth at least eight if and only if the sequence of labels of 
the block-columns in $A$ forms a non-averaging sequence over 
$\mathbb{Z}_q$. 

To extend the above result to PAC's with four block-rows, 
we require the following generalization of Definition 1. 

Definition 2. Let $c$ be a fixed positive integer. A sequence of 
distinct non-negative integers $n_1, n_2, n_3, \ldots$ is defined to be a 
$c$-non-averaging sequence if $n_i + c\,n_j = (c+1)\,n_k$ implies that 
$i = j = k$. We extend this definition as before to sequences 
over $\mathbb{Z}_N$, for an arbitrary $N \in \mathbb{Z}^+$. 

Note that a sequence is $c$-non-averaging if and only if it does 
not contain three elements of the form $n$, $n+t$, $n+(c+1)t$, for 
some integers $n, t$, with $t \neq 0$. We can now state the following 
corollary to Theorem 2. 

Corollary 4. Let $H$ be the parity-check matrix of a PAC with 
modulus $q$, consisting of four block-rows, and let $A$ be the $4q \times 
mq$ matrix obtained by deleting some $q - m$ block-columns 
from $H$. The shortened array code with parity-check matrix $A$ 
has girth at least eight if and only if the sequence of block- 
column labels in $A$ is non-averaging and 2-non-averaging over 
$\mathbb{Z}_q$. 
We next consider the case of cycles of length eight. By 
the reasoning used to derive (5), it follows from Theorem 1 
that a PAC contains a cycle of length eight if and only if 
its parity-check matrix contains a closed path of the form 
$(r_1, i), (r_1, j), (r_2, j), (r_2, k), (r_3, k), (r_3, l), (r_4, l), (r_4, i)$ 
such that 

$$
i(r_1 - r_4) + j(r_2 - r_1) + k(r_3 - r_2) + l(r_4 - r_3) \equiv 0 \pmod q.
\tag{7}
$$

Note that closed paths of length eight may pass through 
two, three or four different block-columns of the parity-check 
matrix of the PAC. 

Let us first consider the situation where a closed path passes 
through exactly two different block-columns. Let i and j be 
the labels of these block-columns. This closed path forms a 
cycle of length eight if and only if (7) is satisfied with $k = i$ 
and $l = j$. A re-grouping of terms results in the equation 

$$
(i - j)(r_1 + r_3 - r_2 - r_4) \equiv 0 \pmod q,
$$

which, for $i \neq j$, is satisfied if and only if 

$$
r_1 + r_3 - r_2 - r_4 \equiv 0 \pmod q.
\tag{8}
$$

Now, observe that for a PAC with column-weight $r \ge 3$, the 
above equation is always satisfied by taking $r_1 = 0$, $r_2 = 1$, 
$r_3 = 2$ and $r_4 = 1$. This shows that in a PAC with column- 
weight $r \ge 3$, any pair of block-columns is involved in a 
cycle of length eight. Hence, shortening will never be able 
to eliminate cycles of length eight from such a PAC (except 
obviously in the trivial case where we delete all but one block- 
column), implying that shortened PAC's can have girth at most 
eight. We record this fact in the lemma below. 

Lemma 5. A shortened PAC of column-weight at least three 
has girth at most eight. 

The following theorem provides the constraining equations 
that govern cycles of length eight involving three or four 
different block-columns in a PAC with row-weight $q$ and 
column-weight three or four. The proof is along the lines of 
that of Theorem 2 and is omitted. 

Theorem 6. In a PAC with modulus $q$ and column-weight 
$r = 3$, the constraining equations, over the ring $\mathbb{Z}_q$, for the 
block-column labels $i, j, k, l$ specifying cycles of length eight 
involving three or four different block-columns are 

$$
\begin{aligned}
i - j - k + l &= 0, & 2i - j - 2k + l &= 0, \\
2i + j - 3k &= 0, & 2i - j - k &= 0.
\end{aligned}
\tag{9}
$$

For PAC's with modulus $q$ and column-weight $r = 4$, the set 
of constraining equations, over $\mathbb{Z}_q$, for the labels $i, j, k, l$ that 
describe cycles of length eight involving three or four different 
block-columns is 

$$
\begin{aligned}
3i - j - k - l &= 0, & 3i - 2j - 2k + l &= 0, \\
3i - 3j + k - l &= 0, & 3i - 3j + 2k - 2l &= 0, \\
2i - 2j + k - l &= 0, & i + j - k - l &= 0, \\
2i - j - k &= 0, & 4i - 3j - k &= 0, \\
3i - 2j - k &= 0.
\end{aligned}
\tag{10}
$$

Figure 1 shows the structures of some cycles of lengths 
six and eight, and provides the modulo-$q$ equation governing 
each such cycle. The generic variables $a, b, c$ and $i, j, k, l$ 
represent the block-row and block-column labels, respectively. 
The equations governing all such cycles are also summarized 
in Tables II and III. 

[Figure 1 panels: $(a-b)i + (2b-a-c)j + (c-b)k = 0$; $i - j + k - l = 0$] 

Fig. 1. Some cycles of lengths six and eight, and their governing equations. 

It should be abundantly clear by now that we can eliminate 
a large number of cycles of length eight from a PAC by 
selectively deleting some of its block-columns, retaining only 
those block-columns the set of whose labels does not contain 
solutions to some or all of the equations listed in Theorem 6. 
Note also that the equations listed in (6), upon relabeling the 
variables if necessary, form a subset of the equations listed in 
(9), as well as of those in (10). Hence, if we shorten a PAC in 
such a way as to retain only those block-columns whose labels 
form a non-averaging and 2-non-averaging sequence, not only 
does the resultant shortened code have no cycles of length six, 
but it also has fewer cycles of length eight than the original 
code. 

As observed earlier, shortened PAC's cannot have girth 
larger than eight. This is a direct consequence of the fact that 
the block-row labels of a shortened PAC with column-weight 
at least three always contain a solution to (8), and hence any 
such code always contains cycles of length eight that pass 
through pairs of distinct block-columns. On the other hand, 
IAC's can be constructed in such a way as to avoid cycles 
of length eight that involve only two different block-columns. 
Analogous to (8), the equation governing such cycles in an 
IAC is 

$$
a_{r_1} + a_{r_3} - a_{r_2} - a_{r_4} \equiv 0 \pmod q.
\tag{11}
$$

Thus, if the block-row labels of the IAC are chosen so that 
they do not contain solutions to (11), then such eight-cycles 
cannot arise. Examples of such sets of block-row labels are 
$\{0, 1, 3\}$ for an IAC with three block-rows, and $\{0, 1, 3, 7\}$ for 
an IAC with four block-rows. Such IAC's can be shortened to 
yield codes with girth ten or twelve, provided that the block- 
column labels retained in the shortened code avoid a set of 
constraining equations analogous to (6), (9) and (10). The 
equations governing cycles of lengths six, eight and ten for 
IAC's with three block-rows ($r = 3$) and label set $\{0, 1, 3\}$ 
are listed in Table IV. Similarly, Table V lists the twenty- 
eight equations governing cycles of lengths six and eight in 
IAC's with four block-rows ($r = 4$) and label set $\{0, 1, 3, 7\}$. 
There are more than fifty equations governing cycles of length 
ten in IAC's with $r = 4$. These equations were obtained 
via an exhaustive computer-aided analysis of all the possible 
structures that cycles can have in these codes. 
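
Whether a set of block-row labels avoids the two-column eight-cycles can be tested by enumerating the admissible index quadruples. The sketch below assumes the governing congruence is a_{r1} + a_{r3} - a_{r2} - a_{r4} = 0 (mod q), obtained from (4) in the same way as (8), with consecutive block-rows on the closed path required to differ. The function name is hypothetical.

```python
from itertools import product


def avoids_two_column_8cycles(row_labels, q):
    """True if no closed path r1 -> r2 -> r3 -> r4 -> r1 through two
    block-columns satisfies a_r1 + a_r3 - a_r2 - a_r4 = 0 (mod q)."""
    a = row_labels
    idx = range(len(a))
    for r1, r2, r3, r4 in product(idx, repeat=4):
        # Consecutive block-rows on a closed path must be different;
        # r1 == r3 or r2 == r4 is allowed.
        if r1 == r2 or r2 == r3 or r3 == r4 or r4 == r1:
            continue
        if (a[r1] + a[r3] - a[r2] - a[r4]) % q == 0:
            return False
    return True
```

Under this assumption, the PAC labels 0, 1, 2 fail the test for every q, as expected from Lemma 5. The outcome for an IAC depends on the modulus: in this sketch, {0, 1, 3, 7} passes for q = 17 but fails for q = 11, where 7 + 7 = 0 + 3 (mod 11).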

It is worth pointing out that Tables II-V can be used not only 
to construct codes with a prescribed girth, but also 
to design codes with a pre-specified set of cycles. This 
can help in studying the effects of various cycle classes on the 
performance of a code. 

The structure of the parity-check matrix in an array code 
allows us to use existing results in the literature to obtain upper 
and lower bounds on the minimum distance, d, of such codes. 
A lower bound on d for regular LDPC codes was derived in 
[28]: 



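$$
d \ge
\begin{cases}
2\,\dfrac{(r-1)^{(g-2)/4} - 1}{r-2} + \dfrac{2}{r}\,(r-1)^{(g-2)/4}, & g/2 \text{ odd}, \\[2ex]
2\,\dfrac{(r-1)^{g/4} - 1}{r-2}, & g/2 \text{ even},
\end{cases}
\tag{12}
$$
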

where $g$ is the girth of the code and $r$ is the column-weight 
of the parity-check matrix (i.e., the degree of any variable 
node). This bound can be improved slightly in some cases by 
noting that the minimum distance of an array code must be 
even, since the code can have even-weight codewords only. 
This is a consequence of the fact that within any block-row, 
$[P^{a_i \cdot 0}\ P^{a_i \cdot 1}\ P^{a_i \cdot 2}\ \cdots\ P^{a_i (q-1)}]$, of the parity-check matrix 
of an array code, the rows sum to $[1\ 1\ 1\ \cdots\ 1]$, and hence the 
dual of an array code always contains the all-ones codeword. 

For bounding d from above, we make use of a particularly 
elegant result due to MacKay and Davey [18], which shows 
that parity-check matrices containing an r x (r + 1) grid 
of permutation matrices $P_{i,j}$ that commute (i.e., for which 
$P_{i,j} P_{k,l} = P_{k,l} P_{i,j}$) must have minimum distance at most 
$(r+1)!$. Table I lists the lower and upper bounds on minimum 
distance for array codes with column-weight $r \in \{3, 4\}$ and 
girth $g \in \{8, 10, 12\}$. 
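
The entries of Table I can be reproduced with a few lines of arithmetic. The sketch below is an assumption-laden reading of the lower bound: the even-g/2 branch is 2((r-1)^(g/4) - 1)/(r-2), the odd-g/2 branch is assumed to carry an extra (2/r)(r-1)^((g-2)/4) term, and the result is rounded up to the next even integer because array codes have only even-weight codewords. The function names are illustrative.

```python
from fractions import Fraction
from math import ceil, factorial


def tree_lower_bound(r, g):
    """Girth-based lower bound on d (odd-g/2 branch is an assumed form)."""
    if (g // 2) % 2 == 0:
        return 2 * Fraction((r - 1) ** (g // 4) - 1, r - 2)
    e = (g - 2) // 4
    return 2 * Fraction((r - 1) ** e - 1, r - 2) + Fraction(2, r) * (r - 1) ** e


def even_lower_bound(r, g):
    """Round the bound up to an integer, then up to the next even integer."""
    d = ceil(tree_lower_bound(r, g))
    return d if d % 2 == 0 else d + 1


def commuting_grid_upper_bound(r):
    """Upper bound (r + 1)! for an r x (r + 1) grid of commuting blocks."""
    return factorial(r + 1)


table = {
    (r, g): (even_lower_bound(r, g), commuting_grid_upper_bound(r))
    for r in (3, 4) for g in (8, 10, 12)
}
```

With these assumptions, `table` agrees with the lower and upper bounds listed in Table I for all six (r, g) pairs.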
A. The Code Mask 

Array codes, as well as the general class of quasi-cyclic 
LDPC codes with parity-check matrices consisting of blocks 
of circulant permutation matrices, cannot have girth exceeding 
twelve [6]. This is most easily seen by examining the 
example in Figure 2. There, a sub-matrix of a parity-check 
matrix consisting of circulant permutation blocks $P_i$, $i = 
1, 2, \ldots, 6$, is shown, along with a directed closed path labeled 
$abcdefghijkl$ that traverses the blocks. Setting $P_i = P^{b_i}$ for 
some circulant permutation matrix $P$ and exponents $b_i$, we see 
that the condition in Theorem 1 is satisfied, since 

$$
b_1 - b_4 + b_5 - b_2 + b_3 - b_6 + b_4 - b_1 + b_2 - b_5 + b_6 - b_3 = 0.
\tag{13}
$$

Thus, length-12 cycles are guaranteed to exist in any quasi- 
cyclic LDPC code with parity-check matrix consisting of 
blocks of circulant permutation matrices. 

Fig. 2. Cycle of length twelve in an array code. 

Nevertheless, using a masking approach, array codes can 
be modified so that their girth exceeds twelve. Masks were 
introduced in [20] for the purpose of increasing the girth of 
codes as well as for constructing irregular LDPC codes. As an 
illustrative example, consider the matrix $M$ in (14) below. It 
consists of $q \times q$ zero matrices and $q \times q$ circulant permutation 
blocks $P_i = P^{b_i}$, for some integers $b_i$. One can view $M$ as 
arising from a parity-check matrix of an array code, or more 
generally, a quasi-cyclic code with circulant permutation 
blocks, from which some blocks are "zeroed out" according 
to a given mask. The matrix $M$ does not contain a submatrix 
of the form depicted in Figure 2. Consequently, there exist no 
length-12 cycles that traverse exactly six permutation matrix 
blocks. Of course, this is achieved at the expense of increased 
code length (for the given example, the length has to be 
doubled). Other kinds of length-12 cycles may still exist, 
but these are governed by non-trivial homogeneous linear 
equations similar in form to those governing shorter cycles, 
and can be eliminated by a judicious choice of the exponents 
$b_i$. 



IV. Avoiding Solutions to Cycle-Governing 
Equations 

Cycle-governing equations, such as those listed in Tables II-V, 
are always of the following type: 

$$
\sum_{i=1}^{m} c_i \mathsf{u}_i \equiv 0 \pmod q,
\tag{15}
$$

the integer $m$ being the number of distinct block-columns 
through which the cycle passes, the $\mathsf{u}_i$'s being variables$^2$ that 
denote the labels of those $m$ block-columns, and the $c_i$'s 
being fixed nonzero integers (independent of $q$) such that 
$\sum_{i=1}^{m} c_i = 0$. This is because all such equations arise as 
special cases of an equation of the form (4), and clearly, 
$(a_{i_1} - a_{i_k}) + (a_{i_2} - a_{i_1}) + \cdots + (a_{i_k} - a_{i_{k-1}}) = 0$. Any 
solution $u = (u_1, u_2, \ldots, u_m)$ to (15), with $u_i \in [0, q-1]$, 
and such that $u_i \neq u_j$ when $i \neq j$, represents a cycle passing 
through the $m$ block-columns whose labels form the solution 
vector $u$. 

$^2$To avoid the sloppiness of using $u_i$ to denote both a variable and a value it 
can take, we will make typographical distinctions between the two whenever 
necessary. 

TABLE I 

Bounds on the minimum distance, d, of array codes for various values of column-weight, r, and girth, g. 

            r = 3             r = 3             r = 4             r = 4
Girth g     Lower bound on d  Upper bound on d  Lower bound on d  Upper bound on d
8           6                 24                8                 120
10          10                24                14                120
12          14                24                26                120


To avoid potential ambiguity, we establish some terminology 
that we will use consistently in the rest of the paper. Given 
a homogeneous linear equation of the form $\sum_{i=1}^{m} c_i \mathsf{u}_i = 0$, 
we refer to a vector $(u_1, u_2, \ldots, u_m) \in [0, q-1]^m$ as a 
solution over $\mathbb{Z}_q$ to the equation if $\sum_{i=1}^{m} c_i u_i \equiv 0 \pmod q$. 

If $u = (u_1, u_2, \ldots, u_m) \in \mathbb{Z}^m$ is such that $\sum_{i=1}^{m} c_i u_i = 0$, 
then $u$ is referred to as an integer solution to the equation. In 
both cases, a solution $u = (u_1, u_2, \ldots, u_m)$ to (15), with all 
the $u_i$'s distinct, will be referred to as a proper solution. 

The design of a shortened array code typically involves 
determining the smallest prime $q$ for which there exists a 
sequence of integers $S \subseteq [0, q-1]$ of some desired cardinality 
$s$, such that there is no proper solution with entries in $S$ to 
any equation within a certain set of cycle-governing equations. 
This choice of $q$ would guarantee the smallest possible code 
length, equal to $qs$, for a PAC or an IAC with prescribed girth, 
column-weight $r$ and designed code rate $R = 1 - r/s$. For 
example, if we seek an IAC with $r = 3$, designed rate $R = 1/2$ 
and girth ten, then we need the smallest $q$ that guarantees the 
existence of a set $S$ of cardinality at least six that does not 
contain a proper solution to any of the equations listed in 
Table IV. It is therefore useful to estimate, as a function of $q$, 
the size of the largest subset of $[0, q-1]$ that avoids proper 
solutions to certain linear equations of the form given in (15). 
In this section, we provide a number of results that bound the 
size of such a largest subset. 
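
The design procedure just described can be sketched as a small search. The code below is a hedged illustration, not the construction used in the paper: it represents each cycle-governing equation by its coefficient tuple (c_1, ..., c_m), tests for proper solutions over Z_q by brute force, and grows a solution-free set greedily. Because a greedy set need not be the largest one, the prime it returns is only an upper bound on the smallest admissible q. All function names are hypothetical, and the non-averaging equation 2u_1 - u_2 - u_3 = 0 stands in for the equations of Table IV.

```python
from itertools import permutations


def has_proper_solution(values, coeffs, q):
    """True if some tuple of distinct entries of `values` solves
    sum(c_i * u_i) = 0 (mod q)."""
    for combo in permutations(values, len(coeffs)):
        if sum(c * u for c, u in zip(coeffs, combo)) % q == 0:
            return True
    return False


def greedy_avoiding_set(q, equations):
    """Scan 0..q-1 and keep each label that creates no proper solution."""
    chosen = []
    for x in range(q):
        candidate = chosen + [x]
        if not any(has_proper_solution(candidate, eq, q) for eq in equations):
            chosen = candidate
    return chosen


def smallest_prime_found(size, equations, primes):
    """First prime in `primes` whose greedy set reaches `size` labels."""
    for q in primes:
        labels = greedy_avoiding_set(q, equations)
        if len(labels) >= size:
            return q, labels[:size]
    return None


non_averaging = [(2, -1, -1)]
result = smallest_prime_found(4, non_averaging, [5, 7, 11, 13])
```

In this toy setting the greedy search keeps the labels 0, 1, 3 for q = 7 and first reaches four labels at q = 11. A shortened code with column-weight r built from s retained block-columns then has length qs and designed rate 1 - r/s.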

Equations of the form $\sum_{i=1}^{m} c_i \mathsf{u}_i = 0$, with $\sum_{i=1}^{m} c_i = 0$, 
have been extensively studied in Ramsey theory [9, Chapter 3], 
[15, Chapter 9]. It is known [7, Fact 3] that any such equation 
that is not of the form $\mathsf{u}_1 - \mathsf{u}_2 = 0$ (or an integer multiple of 
it) has a proper solution. In fact [7, Theorem 2], for any $\epsilon > 0$ 
and sufficiently large $N$, if $L \subseteq [1, N]$ is such that $|L| > \epsilon N$, 
then $L$ contains a proper solution to such an equation. This 
implies the following result: 

Theorem 7. Let $m \ge 3$, and let $c_i$, $i = 1, 2, \ldots, m$, be nonzero 
integers such that $\sum_{i=1}^{m} c_i = 0$. For an arbitrary $q > 1$, let $s(q)$ 
be the size of the largest subset of $[0, q-1]$ that does not contain 
a proper solution to $\sum_{i=1}^{m} c_i \mathsf{u}_i \equiv 0 \pmod q$. Then, 

$$
\lim_{q \to \infty} \frac{s(q)}{q} = 0.
$$

Proof. Let $S(q) \subseteq [0, q-1]$ be a set of size $s(q)$ that 
does not contain any proper solution to $\sum_{i=1}^{m} c_i \mathsf{u}_i \equiv 0 
\pmod q$. Clearly, $S(q)$ does not contain a proper solution to 
$\sum_{i=1}^{m} c_i \mathsf{u}_i = 0$ (without the modulo-$q$ reduction) either. Note 
that since $(1, 1, \ldots, 1)$ is a solution to $\sum_{i=1}^{m} c_i \mathsf{u}_i = 0$, $(u_i)$ is 
a solution iff $(u_i + 1)$ is a solution. Thus, $L(q) = S(q) + 1 = 
\{j + 1 : j \in S(q)\}$ is a set of cardinality $s(q)$ in $[1, q]$ that 
does not contain a proper solution to $\sum_{i=1}^{m} c_i \mathsf{u}_i = 0$. Hence, 
for any $\epsilon > 0$, we must have $s(q) < \epsilon q$ for all sufficiently 
large $q$, and the desired result follows. ■ 

We have thus established that the size of a subset of $[0, q-1]$ 
containing no proper solution to any equation from a given set 
of cycle-governing equations grows sub-linearly in $q$. This is 
a disappointing result from the point of view of our strategy 
of shortening array codes to eliminate cycles. Indeed, starting 
with an array code of column-weight $r$, length $q^2$ and designed 
rate $1 - r/q$, if we shorten the code so as to eliminate cycles 
governed by an equation of the form $\sum_{i=1}^{m} c_i \mathsf{u}_i \equiv 0 \pmod q$, 
the resulting shortened code can have designed rate no larger than 
$1 - r/s(q)$, where $s(q)$ is as defined in the statement of 
Theorem 7. Since $s(q)/q$ goes to 0 as $q$ increases, the rate 
penalty associated with shortening is severe for large values 
of $q$ (or equivalently, for large values of the length of the 
parent code). However, from a practical standpoint, this does 
not appear to be a problem, as for the moderate values of $q$ 
useful in practical code constructions, the rate penalty incurred 
by shortening remains within reasonable limits. Consequently, 
it is possible to construct, for example, designed rate-1/2 codes 
of girth eight and ten that perform much better than the 
comparable codes in the existing literature, as we shall see 
in Section 6. 

A precise estimate of the rate at which $s(q)/q$ goes to zero 
for various types of cycle-governing equations can be very 
useful for the purpose of practical code design, as this provides 
us with an understanding of how the rate penalty incurred 
in shortening an array code changes with the modulus $q$. 
More generally, given a collection, $\Omega$, of homogeneous linear 
equations over $\mathbb{Z}_q$ of the form (15), let $s(q; \Omega)$ be the size of 
the largest subset of $[0, q-1]$ that does not contain a proper 
solution over $\mathbb{Z}_q$ to any of the equations in $\Omega$. From the result 
of Theorem 7, it is clear that $s(q; \Omega)$ grows sub-linearly with $q$. 
In the rest of this section, we provide upper and lower bounds 
on $s(q; \Omega)$ for various choices of $\Omega$. 

A. Upper bounds on $s(q; \Omega)$

Explicit upper bounds for $s(q; \Omega)$ can be obtained for any
$\Omega$ containing an equation (over $\mathbb{Z}_q$) of the form $2x - y - z = 0$
or $x + y - z - u = 0$. These equations have been extensively
studied in other contexts, and in such cases, there are good
estimates available for the growth rate of sequences avoiding
solutions to these equations.

Recall from the definition given earlier that sequences avoiding proper
solutions to $2x - y - z = 0$ are called non-averaging sequences.
Correspondingly, sequences avoiding proper solutions to the
equation $x + y - z - u = 0$ are called Sidon sequences (see
e.g. [23]), as made precise by the definition below.

Definition 3. A Sidon sequence is a sequence of distinct integers
$n_1, n_2, n_3, \ldots$ with the property that for all $i, j, k, l$ such
that $i \le j$, $k \le l$, $n_i + n_j = n_k + n_l$ if and only if $\{i, j\} =
\{k, l\}$. Similarly, given an $N \in \mathbb{Z}^+$, a Sidon sequence over $\mathbb{Z}_N$
is a sequence of distinct integers $n_1, n_2, n_3, \ldots$ in $[0, N-1]$
such that for all $i, j, k, l$ with $i \le j$, $k \le l$, $n_i + n_j \equiv n_k + n_l
\pmod{N}$ if and only if $\{i, j\} = \{k, l\}$.

Upper bounds on the sizes of non-averaging sequences
and Sidon sequences over $\mathbb{Z}_N$ are given in the next lemma.
Observe that for any $N \in \mathbb{Z}^+$, a non-averaging sequence
over $\mathbb{Z}_N$ is automatically a non-averaging sequence (over $\mathbb{Z}^+$).
The result of part (a) of the lemma is thus a straightforward
application of the classical upper bound, due to Roth [9,
Section 4.3, Theorem 8], on the cardinality of the largest
non-averaging sequence in $[0, N-1]$.

Lemma 8. (a) (Roth's theorem) The cardinality of any
non-averaging sequence over $\mathbb{Z}_N$ is bounded from above by
$c_0 N / \log\log N$, for some fixed constant $c_0 > 0$.

(b) For any odd integer $N > 0$, the cardinality of a Sidon
sequence over $\mathbb{Z}_N$ is bounded from above by $\sqrt{N - 3/4} + 1/2$.

We defer the proof of part (b) of the above lemma to the
Appendix. In terms of the quantity $s(q; \Omega)$, the lemma can be
restated as:

(a) If $\Omega$ contains the equation $2x - y - z \equiv 0 \pmod{q}$, then
$s(q; \Omega) \le c_0\, q / \log\log q$, for some fixed constant $c_0 > 0$.

(b) If $\Omega$ contains the equation $x + y - z - u \equiv 0 \pmod{q}$,
then $s(q; \Omega) \le \sqrt{q - 3/4} + 1/2$.
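
As a small illustration of part (b), the following sketch checks the Sidon property over $\mathbb{Z}_N$ by brute force and evaluates the bound $\sqrt{N - 3/4} + 1/2$. The function names `is_sidon_mod` and `sidon_upper_bound` are illustrative only, and the check assumes the reading of Definition 3 in which sums $n_i + n_j$ with $i \le j$ must be pairwise distinct modulo $N$.

```python
from itertools import combinations_with_replacement
from math import floor, sqrt


def is_sidon_mod(seq, N):
    """Return True if all sums n_i + n_j (i <= j) are distinct modulo N.

    Assumes the entries of seq are distinct residues in [0, N-1].
    """
    sums = [(a + b) % N for a, b in combinations_with_replacement(seq, 2)]
    return len(sums) == len(set(sums))


def sidon_upper_bound(N):
    """Largest integer not exceeding sqrt(N - 3/4) + 1/2 (N odd)."""
    return floor(sqrt(N - 0.75) + 0.5)


print(is_sidon_mod([0, 1, 3], 7), sidon_upper_bound(7))
print(is_sidon_mod([0, 1, 2, 4], 7))   # fails: 1 + 1 = 0 + 2
print(sidon_upper_bound(1213))
```

For the modulus $q = 1213$ used in the simulations, the bound limits a girth-ten shortened array code to a few dozen block-columns, which makes the rate penalty discussed above concrete.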

In a PAC with modulus $q$ and column-weight $r \ge 3$, the
equation $2x - y - z \equiv 0 \pmod{q}$ always governs six-cycles,
as can be seen by setting $r_1 = 0$, $r_2 = 1$ and $r_3 = 2$ in (5). So,
if a shortened PAC has girth eight, then its sequence of
block-column labels must not contain solutions to $2x - y - z \equiv 0
\pmod{q}$, i.e., must be non-averaging over $\mathbb{Z}_q$. Hence by
Lemma 8(a), the number of block-columns in the parity-check
matrix of the shortened PAC cannot exceed $c_0\, q / \log\log q$.

Similarly, in an array code with modulus $q$, the equation
$x + y - z - u \equiv 0 \pmod{q}$ always governs eight-cycles that
pass through any two distinct block-rows and four distinct
block-columns (see, for example, the cycles on the bottom-right
of the earlier figure illustrating eight-cycles). So, if an array code
is shortened to obtain girth ten, then the sequence of block-column
labels retained in the shortened code must be a Sidon sequence over
$\mathbb{Z}_q$, and therefore, Lemma 8(b) applies. We have thus proved the
following theorem.

Theorem 9. (a) The number of block-columns in the parity-check
matrix of a shortened PAC with modulus $q$, column-weight
$r \ge 3$ and girth eight cannot exceed $c_0\, q / \log\log q$.

(b) The number of block-columns in the parity-check matrix
of a shortened array code with modulus $q$, column-weight $r \ge 2$
and girth ten is at most $\sqrt{q - 3/4} + 1/2$.



Roughly speaking, the above theorem says that the rate of
a shortened PAC with modulus $q$, column-weight $r \ge 3$ and
girth eight cannot be more than $1 - \frac{r \log\log q}{c_0\, q}$. Similarly, the
rate of a shortened array code with modulus $q$, column-weight
$r \ge 2$ and girth ten is, as a rough estimate, bounded from
above by $1 - \frac{r}{\sqrt{q}}$.

It is natural to want to compare the bounds of Theorem 9 to
those obtained from the application of the Moore bound to the
Tanner graphs of array codes. The Moore bound^1 for a bipartite
graph [11] bounds the number of vertices in the graph in terms
of the girth and the average left and right degrees. Consider
a bipartite graph with $n_L$ left vertices, $n_R$ right vertices, $m$
edges and girth $g$. Let $d_L = m/n_L$ be the average left degree, and
$d_R = m/n_R$ the average right degree. Then,

$$n_L \ge \sum_{i=0}^{g/2-1} (d_R - 1)^{\lceil i/2 \rceil} (d_L - 1)^{\lfloor i/2 \rfloor}, \qquad (16)$$

$$n_R \ge \sum_{i=0}^{g/2-1} (d_L - 1)^{\lceil i/2 \rceil} (d_R - 1)^{\lfloor i/2 \rfloor}. \qquad (17)$$

The above bounds are easily proved for bi-regular bipartite
graphs, i.e., graphs in which each left (resp. right) vertex has
degree $d_L$ (resp. $d_R$).

Now, the Tanner graph of an array code of modulus $q$,
column-weight $r$ and having $s$ block-columns is bi-regular
with $n_L = qs$, $n_R = qr$, $d_L = r$ and $d_R = s$. So, for such a
Tanner graph of girth eight, the bound in (16) becomes

$$qs \ge 1 + (s-1) + (s-1)(r-1) + (s-1)^2 (r-1) = s\,[1 + (s-1)(r-1)],$$

which yields the bound

$$s \le 1 + \frac{q-1}{r-1}. \qquad (18)$$

The bound in (17) also gives exactly the same result. Note
that this bound is, asymptotically in $q$, looser than the bound
in Theorem 9(a). But for practical purposes, this is a more
useful bound than that of the theorem because the constant $c_0$ in the
theorem is not explicitly specified.

On the other hand, applying (16) to the Tanner graph of an
array code of girth ten, we get

$$qs \ge s\,[1 + (s-1)(r-1)] + (s-1)^2 (r-1)^2,$$

which upon re-arrangement becomes

$$r(r-1)\,s^2 - [r(2r-3) + q]\,s + (r-1)^2 \le 0.$$

Solving for $s$ now yields

$$s \le \frac{q + r(2r-3) + \sqrt{(q + r(2r-3))^2 - 4r(r-1)^3}}{2r(r-1)}. \qquad (19)$$

For $q \gg r^2$, this upper bound is roughly $\frac{q}{r(r-1)}$. It is clear
that in most cases of interest, this is not as good a bound as
that of Theorem 9(b). We would like to remark that another
upper bound can be obtained via (17), but this turns out to be
looser than the bound in (19).

^1 To be correct, this should be called a Moore-type bound, as the original
Moore bound (see [3, p. 180]) only applies to regular graphs.

We summarize the above bounds in the following theorem. 

Theorem 10. (a) The number of block-columns in the parity-check
matrix of a shortened array code with modulus $q$,
column-weight $r$ and girth eight cannot exceed $1 + (q-1)/(r-1)$.

(b) The number of block-columns in the parity-check matrix
of a shortened array code with modulus $q$, column-weight $r$ and
girth ten is at most

$$\frac{q + r(2r-3) + \sqrt{(q + r(2r-3))^2 - 4r(r-1)^3}}{2r(r-1)}.$$


B. Lower bounds on $s(q; \Omega)$

We next consider the converse problem of finding lower
bounds on the size of integer sequences avoiding solutions
to a collection of cycle-governing equations. The problem of
constructing long sequences of integers that do not contain
solutions to certain kinds of homogeneous linear equations
has a long history. For example, large non-averaging subsets
of $[1, N]$ were described or constructed by Behrend [1], Moser
[21] and Rankin [25], using geometrical arguments. We will
generalize some of these results to cover certain classes of
equations of the form given in (15).

We start with a lower bound on the maximum length of
sequences that are $c_i$-non-averaging over $\mathbb{Z}_q$, for $\ell$ distinct
integers $c_i \in [2, q-2]$. The proof of this bound is provided
in the Appendix.

Theorem 11. Let $\ell \ge 1$, and let $\Omega$ be the collection of equations

$$x + c_i y = (c_i + 1) z, \qquad i = 1, 2, \ldots, \ell,$$

for some constants $c_i \in [1, q-2]$ such that $c_i \ne c_j$ for $i \ne j$.
Then,

$$s(q; \Omega) \ge \left( \frac{3q^2}{\ell\,(q-1)} \right)^{1/3}.$$

The lower bound derived in the theorem above is quite
loose. For example, for $q = 241$, a greedy algorithm (to be
described in Section 5) produces a sequence of 15 integers that
is simultaneously non-averaging and 2-non-averaging over $\mathbb{Z}_q$.
However, the theorem applied with $\ell = 2$, $c_1 = 1$ and $c_2 = 2$
gives a lower bound of 8 for the cardinality of such a sequence.
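
The figure of 8 quoted above can be reproduced numerically. The sketch below assumes that the bound of the theorem reads $(3q^2 / (\ell (q-1)))^{1/3}$, with $\ell$ the number of equations; the function name `theorem11_bound` is ours.

```python
from math import ceil


def theorem11_bound(q, num_equations):
    """Evaluate (3 q^2 / (l (q - 1)))^(1/3) for modulus q and l equations."""
    return (3 * q * q / (num_equations * (q - 1))) ** (1 / 3)


value = theorem11_bound(241, 2)
# s(q; Omega) is an integer, so the bound can be rounded up.
print(round(value, 2), ceil(value))
```

Since the bound evaluates to a little over 7, any admissible sequence of maximum size has at least 8 terms, well short of the 15 found by the greedy search.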

A more general lower bound can be derived by extending
a result of Behrend [1] derived originally for non-averaging
sequences. Consider the following system, $\Omega$, of $\ell$ equations
in the variables $u_1, u_2, \ldots, u_m, v$:

$$\sum_{j=1}^{m} c_{i,j}\, u_j = b_i\, v, \qquad i = 1, 2, \ldots, \ell, \qquad (20)$$

where the coefficients $c_{i,j}$, $b_i$ are non-negative integers such
that for each $i \in [1, \ell]$, at least two of the $c_{i,j}$'s are nonzero,
and $\sum_{j=1}^{m} c_{i,j} = b_i > 0$.



Theorem 12. Given a system, il, as in f20l) . let D = 

maxi<i<£ hi. Then, forq > D^, 

s{q;n) > ^ige-T^^^-^'°«^°s« (l + o(l)) 



where log denotes the natural logarithm, 71 — y \ log!?, 
$\gamma_2 = 2\sqrt{2 \log D}$, and $o(1)$ denotes a correction factor that
vanishes as $q \to \infty$.

We postpone the proof of the theorem to the Appendix.
The above result can be compared directly to the result of
Theorem 11 since the system of equations $x + c_i y = (c_i + 1) z$,
$i = 1, 2, \ldots, \ell$, is of the form given in (20). Therefore, the
result of Theorem 12 applies to this system of equations $\Omega$,
with $D = 1 + \max_i c_i$. It is easily seen that by the bound of
Theorem 12,

$$\lim_{q \to \infty} \frac{s(q; \Omega)}{q^{1-\epsilon}} = \infty$$

for any $\epsilon > 0$. Since $\epsilon$ can be chosen to be arbitrarily small,
this is much stronger, asymptotically in $q$, than the result of
Theorem 11, which only shows that $s(q; \Omega) \ge C q^{1/3}$ for some
constant $C > 0$ independent of $q$. However, for small values of
$q$, particularly for the values of the modulus $q$ typically used
in practical array code design, the bound of Theorem 11 is
better than that of Theorem 12. For instance, when applied to
the system, $\Omega$, consisting of the pair of equations $x + y = 2z$
and $x + 2y = 3z$, the bound of Theorem 12 for $q = 241$
evaluates to 0.66, which just shows that $s(q; \Omega) \ge 1$. As stated
earlier, the bound of Theorem 11 yields $s(q; \Omega) \ge 8$ in this
case.

To conclude this section, we remark that while the problem
of precisely estimating the growth rate of $s(q; \Omega)$ with $q$
is one of considerable interest and value, finding provably
good estimates is a notoriously difficult problem. For example,
the current best lower bound for the growth rate of the
cardinality of non-averaging sequences is that due to Behrend
(Theorem 12 for the special case of $\Omega$ consisting of the single
equation $x + y = 2z$), but it is still not known whether this is
the best possible such bound.

V. Construction Methods 

The simplest and computationally least expensive methods 
for generating integer sequences satisfying a given set of 
constraints are greedy search strategies and variations thereof. 
A typical greedy search algorithm starts with an initial seed 
sequence that trivially satisfies the given constraints, and 
progressively extends the sequence by adding new terms that 
continue to maintain the constraints. 

As an example, to construct a non-negative integer sequence
that contains no solutions to any equation within a system, $\Omega$,
of cycle-governing equations of the form (15), we start with
a seed sequence of $m - 1$ non-negative integers, $n_1 < n_2 <
\ldots < n_{m-1}$, where $m$ is the least number of variables among
any of the equations in $\Omega$. For each $j \ge m$, we take $n_j$ to be
the least integer greater than $n_{j-1}$ such that $\{n_1, n_2, \ldots, n_j\}$
contains no solutions to any equation in $\Omega$. The rate of growth
of elements in a sequence generated by such a greedy search
procedure is influenced by the choice of the seed sequence
[24]. The search needs to be performed only once to generate
a sequence of integers avoiding solutions to any equation in
$\Omega$, and it is easily seen that the algorithm has complexity
$O(L \cdot q^{M+1})$, where $L$ denotes the number of equations in
$\Omega$, $M$ is the maximum number of variables among all these
equations, and $q$ is the prime modulus. Tables II-V list the
output of the greedy search procedure, initialized by different
seed sequences, for finding sequences that avoid solutions to
various cycle-governing equations in PAC's and IAC's. The
first two terms of each sequence listed in the tables form the
seed sequence for the greedy search algorithm.
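
The greedy search just described can be sketched as follows. This is a minimal brute-force illustration, not the authors' implementation: the names `has_proper_solution` and `greedy_sequence` are hypothetical, and the notion of a proper solution is an assumption made here (a solution is treated as trivial when, after grouping variables that take the same value, the coefficients in every group sum to zero).

```python
from itertools import product


def has_proper_solution(elems, new, coeffs, q):
    """True if some non-trivial solution of sum(c*u) = 0 (mod q) uses `new`.

    Assumption: a solution is trivial when, for each distinct value taken
    by the variables, the coefficients attached to that value sum to zero.
    """
    pool = elems + [new]
    for u in product(pool, repeat=len(coeffs)):
        if new not in u:
            continue
        if sum(c * x for c, x in zip(coeffs, u)) % q != 0:
            continue
        weight = {}
        for c, x in zip(coeffs, u):
            weight[x] = weight.get(x, 0) + c
        if any(weight.values()):
            return True
    return False


def greedy_sequence(seed, equations, q, length):
    """Extend `seed` with the least admissible integers below q."""
    seq = list(seed)
    cand = seq[-1] + 1
    while len(seq) < length and cand < q:
        if not any(has_proper_solution(seq, cand, eq, q) for eq in equations):
            seq.append(cand)
        cand += 1
    return seq


# Six-cycle equation 2i - j - k = 0 over Z_1213, seed sequence 0, 1.
print(greedy_sequence([0, 1], [(2, -1, -1)], 1213, 8))
```

Each equation is passed as a tuple of integer coefficients, so a system is simply a list of such tuples. The exhaustive enumeration over all variable assignments is what makes the cost grow exponentially in the number of variables.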

There is an alternative procedure that often generates
sequences with more terms than a simple greedy search routine.
The idea is to start with some construction of a dense sequence
avoiding solutions to some subset of the cycle-governing
equations in the set $\Omega$, and then to sequentially expurgate
elements of that sequence that violate any of the remaining
constraints. After the expurgation procedure is completed,
additional elements may be added to the sequence as long as
they jointly avoid solutions to all cycle-governing equations
in $\Omega$.

A good sequence with which to start this alternative pro- 
cedure can be constructed according to a method outlined by 
Bosznay [4] . The construction proceeds through the following 
steps. First, a prime q is chosen, and along with it the smallest 
integer t such that q < f^. Let 



and let $S' = \{n_1, n_2, \ldots, n_{t-1}\} \cap [0, q-1]$. It can be shown
that the sequence $S'$ does not contain proper solutions over
$\mathbb{Z}_q$ to any equation of the form

m 
i=l 

where ci, C2, . . . , Cm, are positive integers such that 
X]"=i — ^- Next, one uses a simple greedy algorithm to 
find the largest subset $S \subseteq S'$ that does not contain proper
solutions to cycle-governing equations in $\Omega$ that are not of the
above form. The last step in the procedure is to check whether
there exist integers in $[0, q-1]$ that can be added to $S$ without
creating a proper solution within $S$ to some cycle-governing
equation. If such integers exist, they are sequentially added to
the set $S$.
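
The expurgation and extension steps can be sketched as below, taking the starting set as given. This is an illustrative first-fit version under stated assumptions: the names `proper_solution_exists` and `expurgate_and_extend` are hypothetical, the starting set in the example is arbitrary rather than the output of Bosznay's construction, and a solution is again treated as trivial when the coefficients attached to each distinct value sum to zero.

```python
from itertools import product


def proper_solution_exists(S, coeffs, q):
    """True if the set S supports a non-trivial solution of sum(c*u) = 0 (mod q)."""
    for u in product(S, repeat=len(coeffs)):
        if sum(c * x for c, x in zip(coeffs, u)) % q != 0:
            continue
        weight = {}
        for c, x in zip(coeffs, u):
            weight[x] = weight.get(x, 0) + c
        if any(weight.values()):
            return True
    return False


def expurgate_and_extend(start, equations, q):
    """Keep elements of `start` first-fit, then add further integers in [0, q-1]."""
    def admissible(S):
        return not any(proper_solution_exists(S, eq, q) for eq in equations)

    kept = []
    for x in sorted(start):          # expurgation pass over the starting set
        if admissible(kept + [x]):
            kept.append(x)
    for x in range(q):               # extension pass over the remaining integers
        if x not in kept and admissible(kept + [x]):
            kept.append(x)
    return sorted(kept)


result = expurgate_and_extend([0, 1, 2, 3, 4], [(2, -1, -1)], 11)
print(result)
```

A first-fit pass does not in general return the largest admissible subset, so in practice one might try several orderings of the starting set and keep the longest result.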

As illustrative examples, we list three sequences constructed
using the adaptation of Bosznay's method described
above. The sequence 1, 4, 8, 23, 40, 126, 253, 352, 381, 495
constructed by this method does not contain solutions
to any of the equations listed in Table III that govern
cycles of length six and eight in a PAC with modulus
$q = 911$ and column-weight $r = 4$. In comparison,
the greedy algorithm initialized by the seed sequence
0, 1 produces 0, 1, 5, 18, 25, 62, 95, 148, 207. The sequences
6, 8, 165, 217, 435, 654, 1095 and 0, 1, 7, 29, 64, 111, 753,
generated by the modified Bosznay construction and the greedy
algorithm with seed sequence 0, 1, respectively, avoid solutions
to any of the equations listed in Table IV. Finally, in the case
of the equations in Table V, the sequences produced by the two
methods are 2, 4, 28, 217, 255, 435, 654 and 0, 1, 9, 20, 46, 51.
Observe that the sequences produced by the modified Bosznay
construction contain terms that are larger in general than the
terms in the corresponding greedy sequences, where almost all
elements are much smaller than the prime $q$.

VI. Simulation Results 

In this section, we present the bit-error-rate (BER) curves
over an AWGN channel for various (shortened) PAC's and
IAC's, and also provide comparisons with other codes of
similar rates and lengths from the existing literature. All array
codes considered in this section were iteratively decoded using
a sum-product/belief-propagation (BP) decoder.

Figures 3 and 4 show the performance curves, after a
maximum of 30 rounds of iterative decoding, for array codes
of column-weight 3 and row-weight 6; thus all these codes
have designed rate 1/2. The prime modulus used for the
construction of these codes is $q = 1213$, which yields codes
with length 7278. The sets of block-column labels used in
the codes PACr3g6, PACr3g8 and PACr3g8+ in Figure 3
are {0, 1, 2, 3, 4, 5}, {0, 1, 3, 4, 9, 10} and {0, 1, 4, 11, 27, 39},
which correspond to a PAC of girth six, a shortened PAC
of girth eight, and a shortened PAC of girth eight but
without eight-cycles governed by the equations in Table II,
respectively. The codes IACr3g8, IACr3g10 and IACr3g12,
whose performance is plotted in Figure 4, are of girth eight,
ten and twelve, respectively. The respective sets of block-column
labels are {0, 1, 2, 5, 7, 8}, {0, 1, 5, 14, 25, 57}, and
{0, 1, 7, 29, 64, 111}. All the IAC's in the figure have block-row
labels {0, 1, 3}.

Figures 5 and 6 show the results, after a maximum of
30 decoding iterations, for codes with designed rate 1/2 and
column-weight $r = 4$. The array codes in Figure 5 are
shortened PAC's with modulus $q = 911$ and length 7288.
The sequences used for the block-column labels in the codes
PACr4g6, PACr4g8 and PACr4g8+ are {0, 1, 2, 3, 4, 5, 6, 7},
{0, 3, 4, 7, 16, 17, 20, 22} and {0, 1, 5, 18, 25, 62, 95, 148},
respectively. The codes PACr4g6 and PACr4g8 are of girth
six and eight, respectively, while PACr4g8+ is a code of
girth eight with no eight-cycles governed by the equations
in Table III. The codes IACr4g8 and IACr4g10 in
Figure 6 are IAC's of girth eight and ten, respectively,
that use the set of block-row labels {0, 1, 3, 7}, but differ
in the modulus and block-column labels used. The
code of girth eight has modulus $q = 911$, hence length
7288, and block-column labels {0, 1, 2, 5, 9, 10, 18, 42}. The
girth-ten code, on the other hand, uses the modulus $q =
1307$, so that it has length 10456, and block-column labels
{317, 344, 689, 1035, 1178, 1251, 1297, 1303}. The reason for
not choosing $q$ to be 911 in the girth-ten code is that none
of the construction methods discussed in Section 5 produces
a sequence of length eight without solutions over $\mathbb{Z}_{911}$ to any
of the equations listed in Table V. The smallest choice for
the prime $q$ which does produce a sequence of eight block-column
labels satisfying the eight-cycle constraints turns out
to be 1307.
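
The lengths and designed rates quoted above follow from the block structure: a shortened array code with modulus $q$, $s$ retained block-columns and column-weight $r$ has length $qs$ and designed rate $1 - r/s$. A short check, with the illustrative helper name `array_code_params`:

```python
def array_code_params(q, num_block_cols, col_weight):
    """Return (length, designed rate) = (q*s, 1 - r/s) of a shortened array code."""
    length = q * num_block_cols
    rate = 1 - col_weight / num_block_cols
    return length, rate


# (q, s, r) for the column-weight-3 codes and the two column-weight-4 cases.
for q, s, r in [(1213, 6, 3), (911, 8, 4), (1307, 8, 4)]:
    print(q, s, r, array_code_params(q, s, r))
```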

For comparison purposes, each of Figures 3-6 also contains
the BER curves for two other codes: a designed rate-
TABLE II

Cycle-governing equations over $\mathbb{Z}_q$ for PAC's with modulus $q$ and column-weight $r = 3$, and greedy sequences avoiding solutions over $\mathbb{Z}_{1213}$ to them.

Six-cycle equation         Greedy sequences avoiding the six-cycle equation
2i - j - k = 0             0, 1, 3, 4, 9, 10, 12, 13, 27, 28, 30, 38, . . .
                           0, 2, 3, 5, 9, 11, 12, 14, 27, 29, 30, 39, . . .
                           0, 3, 4, 7, 9, 12, 13, 16, 27, 30, 35, 36, . . .

Eight-cycle equations      Greedy sequences avoiding all six- and eight-cycle equations
2i + j - k - 2l = 0        0, 1, 4, 11, 27, 39, 48, 84, 134, 163, 223, 284, 333, . . .
i + j - k - l = 0          0, 2, 5, 13, 20, 37, 58, 91, 135, 160, 220, 292, 354, . . .
[...] - 2k = 0             0, 3, 4, 13, 25, 32, 65, 92, 139, 174, 225, 318, 341, . . .
2i - j - k = 0



TABLE III

Cycle-governing equations over $\mathbb{Z}_q$ for PAC's with modulus $q$ and column-weight $r = 4$, and greedy sequences avoiding solutions over $\mathbb{Z}_{911}$ to them.

Six-cycle equations        Greedy sequences avoiding all six-cycle equations
2i - j - k = 0             0, 1, 4, 5, 11, 19, 20, . . .
3i - j - 2k = 0            0, 2, 5, 7, 13, 18, 20, . . .
                           0, 3, 4, 7, 16, 17, 20, . . .

Eight-cycle equations      Greedy sequences avoiding all six- and eight-cycle equations
3i - j - k - l = 0         0, 1, 5, 18, 25, 62, 95, 148, 207, . . .
3i - 2j - 2k + l = 0       0, 2, 7, 20, 45, 68, 123, 160, 216, . . .
2i - 2j - k + l = 0        0, 3, 7, 22, 39, 68, 123, 154, 244, . . .
3i - 3j + k - l = 0
3i - 3j + 2k - 2l = 0
i + j - k - l = 0
2i - j - k = 0
4i - 3j - k = 0
3i - 2j - k = 0
5i - 3j - 2k = 0




Fig. 3. BER versus $E_b/N_0$ (dB) for designed rate-1/2 PAC's with $r = 3$.

Fig. 4. BER versus $E_b/N_0$ (dB) for designed rate-1/2 IAC's with $r = 3$.



1/2, regular LDPC code of length 8000 with a random-like
structure, as constructed by MacKay and Davey in [17], and a
"random-label" array code in which the block-row and block-column
labels are randomly chosen. The MacKay-Davey code
in Figures 3 and 4 is a (3, 6)-regular code, while that in
Figures 5 and 6 is a (4, 8)-regular code. The random-label
code in Figure 3 is a PAC with $q = 1213$, $r = 3$ and set of
block-column labels {24, 460, 610, 826, 1009, 1012}. Among
the equations in Table II, this label set contains solutions
over $\mathbb{Z}_{1213}$ to only one equation, namely, $3i - 2j - k = 0$;
the solution is $(i, j, k) = (826, 1009, 460)$. Thus, this PAC
contains no six-cycles and relatively few eight-cycles. The
random-label code in Figure 4 is an IAC with the same choices
of $q$, $r$ and block-column labels as in the random-label PAC
TABLE IV

Cycle-governing equations over $\mathbb{Z}_q$ for IAC's with modulus $q$, column-weight $r = 3$ and block-row labels {0, 1, 3}, and greedy sequences avoiding solutions over $\mathbb{Z}_{1213}$ to them.

Six-cycle equation         Greedy sequences avoiding the six-cycle equation
3i - 2j - k = 0            0, 1, 2, 5, 8, 9, 10, 16, 18, 21, 33, 35, 37, 40, . . .
                           0, 2, 4, 7, 9, 11, 14, 16, 18, 31, 35, 39, 45, . . .
                           0, 3, 4, 5, 8, 11, 13, 19, 20, 21, 32, 36, 40, . . .

Eight-cycle equations      Greedy sequences avoiding all six- and eight-cycle equations
3i - 3j - k + l = 0        0, 1, 5, 14, 25, 57, 88, 122, 198, 257, 280, . . .
3i - 3j - 2k + 2l = 0      0, 2, 7, 18, 37, 65, 99, 151, 220, 233, 545, . . .
i + j - k - l = 0          0, 3, 7, 18, 31, 50, 105, 145, 186, 230, 289, . . .
2i + j - k - 2l = 0
4i - 3j - k = 0
2i - j - k = 0
5i - 3j - 2k = 0

Ten-cycle equations            Greedy sequences avoiding six-, eight- and ten-cycle equations
3i - j + k - l - 2m = 0        0, 1, 7, 29, 96, 148, 324
3i - j - 2k + 2l - 2m = 0      0, 2, 7, 29, 70, 178, 733
3i + j + 2k - 3l - 3m = 0      0, 3, 7, 26, 54, 146, 237
3i - j - k - l = 0
3i - 3j - k + l = 0
3i - 2j + k - 2l = 0
i - 4j + k + 2l = 0
3i - j - 5k + 3l = 0
3i - j - 4k + 2l = 0
i - 2j + 2k - l = 0
3i - 2j + 2k - 3l = 0
6i - j - 2k - 3l = 0
5i - j - 2k - 2l = 0
4i - 3j - 3k + 2l = 0
3i - 2j - k = 0
[...] + 3k = 0
3i + 2j - 5k = 0
2i - j - k = 0
6i - j - 5k = 0
5i - j - 4k = 0










Fig. 5. BER versus $E_b/N_0$ (dB) for designed rate-1/2 PAC's with $r = 4$.
Curves: PACr4g6 (n=7288, q=911, g=6); PACr4g8 (n=7288, q=911, g=8);
PACr4g8+ (n=7288, q=911, g=8, few 8-cycles); Random-label PAC
(n=7288, q=911, g=6); MacKay-Davey (n=8000, (4,8)-regular).

Fig. 6. BER versus $E_b/N_0$ (dB) for designed rate-1/2 IAC's with $r = 4$.
Curves: IACr4g8 (n=7288, q=911, g=8); IACr4g10 (n=10456, q=1307, g=10);
Random-label IAC (n=7288, q=911); MacKay-Davey (n=8000, (4,8)-regular).



above, but the block-row label set for the code is {3, 4, 7}. The
random-label PAC in Figure 5 and the random-label IAC in
Figure 6 have $q = 911$, $r = 4$ and set of block-column labels
{17, 210, 415, 442, 552, 694, 811, 865}; the IAC has block-row
labels {2, 5, 7, 8}. The set of block-column labels for the
random-label PAC in Figure 5 supports proper solutions over
$\mathbb{Z}_{911}$ to several of the equations in Table III. These equations
and solutions are tabulated in Table VI. It is clear that this
array code contains many six-cycles and eight-cycles.

From the simulation results presented in the preceding figures, we
TABLE V

Cycle-governing equations over $\mathbb{Z}_q$ for IAC's with modulus $q$, column-weight $r = 4$ and block-row labels {0, 1, 3, 7}, and greedy sequences avoiding solutions over $\mathbb{Z}_{911}$ to them.

Equations (six-cycles)     Greedy sequences avoiding all six-cycle equations
3i - j - 2k = 0            0, 1, 2, 5, 10, 12, 19, 25, 27, 41, 42, 46, 50, 60, . . .
7i - j - 6k = 0            0, 2, 4, 9, 10, 17, 20, 34, 36, 45, 55, 61, 71, 77, . . .
7i - 3j - 4k = 0           0, 3, 4, 5, 8, 13, 20, 27, 37, 46, 47, 48, 51, 66, . . .

Equations (eight-cycles)   Greedy sequences avoiding all six- and eight-cycle equations
[...]                      0, 1, 9, 20, 46, 51, 280
[...]                      0, 2, 11, 19, 42, 83, 118
[...]                      0, 3, 8, 25, 45, 72, 142
[...]
[...]
[...]
6i - 6j - k + l = 0
6i - 4j - 3k + l = 0
4i - 4j - 3k + 3l = 0
3i - 3j - 2k + 2l = 0
3i - 3j - k + l = 0
2i - 2j - k + l = 0
i + j - k - l = 0
9i - 7j - 2k = 0
7i - 5j - 2k = 0
5i - 4j - k = 0
4i - 3j - k = 0
3i - 2j - k = 0
2i - j - k = 0
5i - 3j - 2k = 0
8i - 7j - k = 0
6i - 5j - k = 0
13i - 7j - 6k = 0
10i - 7j - 3k = 0
11i - 7j - 4k = 0












TABLE VI

Solutions over $\mathbb{Z}_{911}$ supported within the set {17, 210, 415, 442, 552, 694, 811, 865} to the cycle-governing equations in Table III.

Equation                   Solutions (i, j, k) or (i, j, k, l)
2i - j - k = 0             (811, 17, 694), (811, 694, 17)
2i - 2j - k + l = 0        (415, 442, 811, 865), (442, 415, 865, 811),
                           (694, 865, 210, 552), (865, 694, 552, 210)
3i - 2j - 2k + l = 0       (865, 694, 811, 415), (865, 811, 694, 415)
3i - 3j + 2k - 2l = 0      (210, 552, 17, 415), (552, 210, 415, 17)
3i - 3j + k - l = 0        (694, 865, 17, 415), (865, 694, 415, 17)
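
The entries of Table VI can be checked mechanically. In the sketch below, the pairing of solutions with equations follows our reading of the flattened table, and the variable names are illustrative; each tuple is substituted into its equation and reduced modulo 911.

```python
q = 911
labels = {17, 210, 415, 442, 552, 694, 811, 865}

# Coefficient tuples mapped to the solutions listed for them (assumed pairing).
table_vi = {
    (2, -1, -1): [(811, 17, 694), (811, 694, 17)],
    (2, -2, -1, 1): [(415, 442, 811, 865), (442, 415, 865, 811),
                     (694, 865, 210, 552), (865, 694, 552, 210)],
    (3, -2, -2, 1): [(865, 694, 811, 415), (865, 811, 694, 415)],
    (3, -3, 2, -2): [(210, 552, 17, 415), (552, 210, 415, 17)],
    (3, -3, 1, -1): [(694, 865, 17, 415), (865, 694, 415, 17)],
}


def residue(coeffs, values, modulus):
    """Value of sum(c * v) reduced modulo `modulus`."""
    return sum(c * v for c, v in zip(coeffs, values)) % modulus


all_ok = all(residue(eq, sol, q) == 0 and set(sol) <= labels
             for eq, sols in table_vi.items() for sol in sols)
print(all_ok)
```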



can clearly observe the sharp improvement in performance
that can be achieved by increasing the girth of an array code,
or even by partially eliminating cycles of a certain fixed
length. As girth increases, the BER curves of array codes
approach that of a random-like LDPC code of similar length
and degree distribution. This provides concrete evidence in
support of the widely-held belief that the girth of a code is
an important factor in determining its performance. This also
appears to be borne out by the performance of the random-label
PAC's in Figures 3 and 5. As can be seen from these
figures, the degradation in performance (in comparison with
the random-like MacKay-Davey codes) of the random-label
PAC of column-weight three is significantly smaller than that
of column-weight four. Recall that the random-label PAC of
column-weight three contains few short cycles, while the code
of column-weight four contains many six-cycles and eight-cycles.

The best performance among the array codes we considered
for our simulations was achieved by IACr4g10, for which there
was no observed error for 50 million simulated blocks and
30 iterations of message-passing, implying that at an SNR of
2.5dB, the BER achieved by the code is less than 10^^. As 
can be seen from Figure 6, this code performs better than the
random-like MacKay-Davey code of comparable parameters.

It is worth pointing out that the PAC's of column-weight
four and the IAC of girth eight and column-weight four significantly
outperform their counterparts with column-weight three.
This is the reverse of the trend observed among LDPC codes
with random-like structure, as it is known that among such
codes, (3, 6)-regular codes have the best threshold properties
at rates below 0.9, as can be clearly seen from the performance
plots of the random-like MacKay-Davey codes in Figures 3
and 5. We conjecture that the observed results are a consequence
of the fact that the array codes of designed rate 1/2,
length around 8000 and column-weight four have minimum
distance significantly larger than their column-weight three
counterparts, or that they have relatively few cycles of length
equal to or exceeding the girth, and probably have almost
optimal structure (i.e., they are comparable to random-like
codes). At the same time, array codes with designed rate 1/2
and column-weight three show a significant gap away from
the optimal performance, since for such a degree distribution
Fig. 7. Comparison of array codes with some quasi-cyclic codes from [6]. All
codes in the figure have lengths around 4100, and are (4,9)-regular.



it is very likely that optimal LDPC codes can have girth much
larger than twelve, and larger minimum distance than the upper
bound listed in Table I.

Finally, we provide some data comparing the performance 
of shortened array codes with that of some of the structured 
LDPC codes studied in the existing literature. We start with 
the class of LDPC codes derived in [14] from projective and 
Euclidean geometries over finite fields. Most of these codes 
have much higher rates than the shortened array codes with 
comparable codelengths. Shortened array codes of a certain 
codelength tend not to achieve rates as high as those achieved 
by codes of the same length derived from projective and Eu- 
clidean geometries due to the relatively small density of integer 
sequences avoiding solutions to cycle-governing equations. 
So, to make a fair comparison, we consider, as an example, 
the code of length 8190 and dimension 4095 obtained by 
"extending" the (4095, 3367) Type-I 2-dimensional Euclidean 
geometry code via the column-splitting procedure described 
in [14, Section VI]. This code has rate 1/2, and so can be 
compared with the designed rate- 1/2 shortened array codes of 
similar lengths. As reported in [14, Table III], a shortened pro- 
jective geometry code with parameters (8190, 4095) achieves a 
BER of 10"'' at an SNR of 6 dB, which is 5.82 dB away from 
the Shannon limit of 0.18 dB. On the other hand, the length-7288,
designed rate-1/2 code IACr4g8 in Figure 6 achieves
the same BER at an SNR of slightly less than 2 dB, which is
considerably closer to the Shannon limit.

Figures 7 and 8 provide a comparison of the performance
of array codes with the codes studied in [6] and [13]. The
first of these figures compares the performance of a pair
of IAC's with a pair of random quasi-cyclic codes and a
random Gallager code from [6, Figure 2]. All the codes in
the figure have lengths around 4100, and are (4,9)-regular,
hence have designed rate 5/9. Both the IAC's plotted have
length 4113, modulus $q = 457$, column-weight $r = 4$, and
block-row labels {0, 1, 3, 7}. The sets of block-column labels


Fig. 8. Comparison of array codes with some codes from [13]. All codes being
compared have lengths around 1330 and designed rates 0.42-0.43.
Curves include: IAC (n=1337, q=191, r=4, g=6); IAC (n=1337, q=191, r=4, g=8);
LU311t ([1331, 560] code, g=8).



used in the two IAC's are {0, 1, 9, 10, 22, 31, 32, 172, 194}
and {0, 1, 9, 10, 24, 43, 88, 90, 326}, respectively. The first
sequence avoids solutions over $\mathbb{Z}_{457}$ to all the equations
governing six-cycles listed in Table V, except for $7i - 3j - 4k = 0$,
which has three solutions, (9, 172, 1), (10, 22, 1) and
(22, 10, 31), within the sequence. The code corresponding
to this sequence thus has girth six. Of the 25 equations
governing eight-cycles listed in Table V, the sequence contains
solutions over $\mathbb{Z}_{457}$ to exactly 11 equations, the number
of solutions to these equations being 50 in all. The sequence
{0, 1, 9, 10, 24, 43, 88, 90, 326} contains no solutions
over $\mathbb{Z}_{457}$ to any of the six-cycle equations in Table V,
but contains a total of 68 solutions to 14 of the eight-cycle
equations. Thus the IAC with this set of block-column labels
has girth eight, but has considerably more cycles of length
up to eight than the IAC with the first set of labels, and
so performs somewhat worse (see Figure 7). Overall, the
performance (in terms of word error rate) of all the codes
in Figure 7 is quite similar, but it should be noted that
the plotted performance of the two IAC's was obtained after a
maximum of 50 rounds of BP decoding, while the codes from
[6] were allowed a maximum of 200 rounds of BP decoding.

In Figure 8, a pair of IAC's is compared with several
codes of similar lengths and rates taken from [13]. The
IAC's in the plot all have length 1337, modulus $q = 191$,
column-weight $r = 4$ and block-row labels $\{0, 1, 3, 7\}$. They
differ in the sequence of block-column labels used: one uses
the sequence $\{0,1,9,10,22,31,126\}$, while the other uses
$\{0,1,5,6,25,46,151\}$. The former sequence contains
solutions over $\mathbb{Z}_{191}$ to exactly one six-cycle equation
from Table IV (the solutions are $(0,126,1)$, $(10,22,1)$,
$(22,10,31)$ and $(126,10,22)$) and a total of 44 solutions to 15
eight-cycle equations. Thus, this sequence yields a girth-six code. The
sequence $\{0,1,5,6,25,46,151\}$ contains solutions over $\mathbb{Z}_{191}$
to none of the six-cycle equations in Table IV, and altogether
50 solutions to 15 eight-cycle equations. Thus, despite the fact
that the IAC with this sequence of block-column labels has
girth eight, its performance is almost identical to that of the
girth-six IAC, which can be explained by the fact that the two
codes have a similar cycle distribution. The code LU311t is a
structured LDPC code based on a construction of a family of
regular bipartite graphs by Lazebnik and Ustimenko [16]. The
parity-check matrix of the code is a 1331 x 1331 square matrix
with row-weight and column-weight 11. The code has girth
eight, dimension 560 and minimum distance at least 22 [13].
The codes R3, R4 and R5 are irregular random-like LDPC
codes with parity-check matrices of column-weight 3, 4 and 5,
respectively. The performance plots of the codes LU311t, R3,
R4 and R5 have been obtained from [13, Figure 5], where it is
stated that a maximum of 500 iterations of BP decoding was
allowed for each of these codes. The performance of the IAC's
in the figure was obtained after a maximum of 50 rounds of
BP decoding. As can be seen from the figure, the two IAC's
match the performance of the random-like column-weight-four
code R4, and easily outperform the code LU311t.
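
The solution triples quoted above for the six-cycle equation $7i - 3j - 4k = 0$ can be checked by direct enumeration. The following Python sketch is illustrative only: the function name `proper_solutions` is hypothetical, and counting ordered triples of pairwise distinct labels is an assumption about what the text means by a solution within a sequence.

```python
from itertools import permutations

def proper_solutions(labels, q, coeffs=(7, -3, -4)):
    """Ordered triples (i, j, k) of pairwise distinct labels with
    a*i + b*j + c*k congruent to 0 modulo q (assumed notion of 'solution')."""
    a, b, c = coeffs
    return [
        (i, j, k)
        for i, j, k in permutations(labels, 3)
        if (a * i + b * j + c * k) % q == 0
    ]

# Block-column label sequences quoted in the text.
SEQ_457 = [0, 1, 9, 10, 22, 31, 32, 172, 194]   # q = 457
SEQ_191_A = [0, 1, 9, 10, 22, 31, 126]          # q = 191, girth-six IAC
SEQ_191_B = [0, 1, 5, 6, 25, 46, 151]           # q = 191, girth-eight IAC
```

Under this assumption, `proper_solutions(SEQ_457, 457)` returns exactly the three triples listed for $q = 457$, `proper_solutions(SEQ_191_A, 191)` returns the four triples listed for $q = 191$, and `proper_solutions(SEQ_191_B, 191)` is empty.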

VII. Conclusion 

In this paper, we considered the problem
of constructing new LDPC codes with large girth based on
the array code construction of [5]. Our contributions were
threefold. Firstly, we provided a simple method for relating
cycles in the Tanner graph of such codes to homogeneous
linear "cycle-governing" equations with integer coefficients.
This yields an approach for constructing codes with a desired
cycle distribution, based on the existence of integer sequences
that avoid solutions to the cycle-governing equations.
Secondly, we provided some bounds on the cardinality of integer
sequences avoiding solutions to such equations, which give
useful estimates of the rate penalty incurred in shortening an
array code to eliminate cycles. Finally, we showed through
extensive simulations the influence of various kinds of short
cycles on the performance of LDPC codes under iterative
decoding.

Appendix 

We provide proofs of Lemma 8(b) and Theorems 11 and
12 in this Appendix.

Proof of Lemma 8(b): Let $S$ be a Sidon sequence over $\mathbb{Z}_N$,
and let $\mathcal{P}$ be the set $\{(a,b) : a, b \in S,\ a \neq b\}$. From the
definition of a Sidon sequence and the fact that $N$ is odd,
it follows that the mapping $f : \mathcal{P} \to [1, N-1]$ defined by
$f(a,b) = a - b \bmod N$ is injective. Therefore, $N - 1 \geq
|\mathcal{P}| = |S|(|S| - 1)$. Solving the associated quadratic equation,
we obtain $|S| \leq \sqrt{N - 3/4} + 1/2$. ■
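
The counting argument can be illustrated numerically. The sketch below is not part of the original proof: the helper names are hypothetical, and it assumes that the property used in the proof is that all differences $a - b \bmod N$ of distinct elements are distinct.

```python
from itertools import permutations
from math import sqrt

def differences_distinct(S, N):
    """True if the differences a - b (mod N) over ordered pairs of
    distinct elements of S are pairwise distinct (assumed property)."""
    diffs = [(a - b) % N for a, b in permutations(S, 2)]
    return len(diffs) == len(set(diffs))

def sidon_upper_bound(N):
    """Upper bound on |S| from the proof: sqrt(N - 3/4) + 1/2."""
    return sqrt(N - 0.75) + 0.5
```

For example, $S = \{0, 1, 3\}$ over $\mathbb{Z}_7$ has six distinct nonzero differences, so $|S|(|S| - 1) = N - 1$ and the bound $\sqrt{7 - 3/4} + 1/2 = 3$ is met with equality.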

For the proof of Theorem 11, we recall some definitions
from graph theory. A hypergraph, $H = (V, E)$, is an ordered
pair of two finite sets: the set of vertices $V$, and the set
of edges $E$, which are arbitrary non-empty subsets of $V$.
A hypergraph is called $h$-uniform if all its edges have the
same cardinality $h$, and is called $s$-regular if all its vertices
belong to the same number, $s$, of edges. A set of vertices of a
hypergraph, $H$, which does not (completely) contain any edge
of $H$ is called an independent set. The maximum cardinality
of an independent set of $H$ is called the independence number
of $H$, and is denoted by $\alpha(H)$.

Proof of Theorem 11: Let $\Omega$ be as in the statement of the
theorem. Define a hypergraph $H(q;\Omega)$ with vertex set $[0, q-1]$,
and a set of edges that consists of all triples of the form

$$\{x,\ x + t \bmod q,\ x + (c_i + 1)t \bmod q\}, \qquad i = 1, 2, \ldots, \ell,$$

for $x \in [0, q-1]$ and $t \in [1, q-1]$. In other words, the
edges of $H(q;\Omega)$ are precisely the proper solutions over $\mathbb{Z}_q$
of equations in $\Omega$. Therefore, a subset of $[0, q-1]$ contains
no proper solution over $\mathbb{Z}_q$ to any equation in $\Omega$ if and only
if it forms an independent set of vertices in $H(q;\Omega)$. Thus,
any lower bound on the independence number of $H(q;\Omega)$ is
also a lower bound on $s(q;\Omega)$. We will prove the bound of
the theorem using the following lower bound [2, p. 136] on
the independence number of a regular, $h$-uniform hypergraph,
$H = (V(H), E(H))$:

$$\alpha(H) \;\geq\; [\,\text{right-hand side illegible in the source}\,] \qquad (21)$$



It is easily seen that the hypergraph $H(q;\Omega)$ is 3-uniform
and $\ell(q-1)$-regular. Indeed, for any $x \in [0, q-1]$, $t \in [1, q-1]$
and $c \in [1, q-2]$, the integers $x$, $x + t$ and $x + (c+1)t$ are
distinct modulo $q$, since $q$ is a prime. Therefore, each edge of
$H(q;\Omega)$ contains exactly three distinct vertices, showing that
$H(q;\Omega)$ is 3-uniform. To see that the graph is $\ell(q-1)$-regular,
we only need to observe that for each vertex $x \in [0, q-1]$,
the triples

$$\{x,\ x + t \bmod q,\ x + (c_i + 1)t \bmod q\}, \qquad i = 1, 2, \ldots, \ell, \quad 1 \leq t \leq q-1,$$

form an exhaustive set of distinct hyperedges
containing the vertex $x$.
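
The role of primality in the distinctness claim above can be seen in a small brute-force check. This is a sketch only; the function name is hypothetical, and by translation invariance it fixes $x = 0$.

```python
def triples_are_distinct(q):
    """Check that 0, t, (c+1)*t are pairwise distinct modulo q for all
    t in [1, q-1] and c in [1, q-2] (x = 0 without loss of generality)."""
    for t in range(1, q):
        for c in range(1, q - 1):  # c ranges over [1, q-2]
            if len({0, t % q, ((c + 1) * t) % q}) != 3:
                return False
    return True
```

The check succeeds for primes such as $q = 7$ and $q = 11$, and fails for the composite modulus $q = 6$, where $t = 2$ and $c = 2$ give $(c+1)t \equiv 0$.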

The number of edges in $H(q;\Omega)$ can be easily computed
from the fact that the graph is 3-uniform and $\ell(q-1)$-regular,
so that we must have $3\,|E(H(q;\Omega))| = \ell(q-1)\,|V(H(q;\Omega))|$.
Consequently, $|E(H(q;\Omega))| = (\ell/3)\,q(q-1)$. The theorem is
proved by plugging this into the bound of (21). ■

It only remains to prove Theorem 12. The proof uses
a technique due to F. A. Behrend (see [9, Section 4.3,
Theorem 8]), and hinges upon the following lemma.

Lemma 13. Given a system, $\Omega$, as in (20), let
$D = \max_{1 \leq i \leq \ell} b_i$, and let $q$ be an integer larger than $D$.
Pick an integer $n \geq 1$ such that $nD < q$, and let
$k = \lfloor (\log q) / \log(nD + 1) \rfloor$. Then, there exists a set, $S$,
of integers from $[0, q-1]$, of cardinality

$$|S| \geq \frac{(n+1)^{k-2}}{k}$$

such that $S$ does not contain a proper solution over $\mathbb{Z}_q$ to any of
the equations in $\Omega$.

Proof. Let $D$, $q$, $n$ and $k$ be as in the statement of the
lemma, and let $M = (nD + 1)^k - 1$. For each $x \in [1, M]$,
let $(x_0, x_1, \ldots, x_{k-1})$ be the $(nD + 1)$-ary representation of
$x$, i.e., $x = \sum_{i=0}^{k-1} x_i (nD + 1)^i$ with $x_i \in [0, nD]$ for
$i = 0, 1, \ldots, k-1$. We will refer to $(x_0, x_1, \ldots, x_{k-1})$ as
the coordinate vector of $x$. Define

$$N(x) = \Big( \sum_{0 \leq i \leq k-1} x_i^2 \Big)^{1/2}.$$

In other words, $N(x)$ is the $\ell^2$-norm of the coordinate vector
$(x_0, x_1, \ldots, x_{k-1})$. For an arbitrary integer $p \geq 1$, define the
set

$$R_{p,n} = \{x \in [1, M] : 0 \leq x_i \leq n \ \forall i,\ N(x)^2 = p\}.$$

In other words, $R_{p,n}$ is the set of all integers $x \in [1, M]$ that
satisfy two properties:

(i) the digits $x_0, x_1, \ldots, x_{k-1}$ in the $(nD + 1)$-ary expansion
of $x$ all lie between 0 and $n$; and

(ii) $N(x)^2 = p$, i.e., the coordinate vector of $x$ lies on the
$\ell^2$-sphere of radius $\sqrt{p}$.

Our goal is to show that there exists a $p^* \geq 1$ such that
$R_{p^*,n}$ does not contain a proper solution over $\mathbb{Z}_q$ to any
equation in $\Omega$, and $|R_{p^*,n}| \geq (n+1)^{k-2}/k$. It then follows that
$R_{p^*,n} - 1 = \{u - 1 : u \in R_{p^*,n}\}$ is the set $S$ in the statement
of the lemma.

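We will first show that for any $p \geq 1$, $R_{p,n}$ cannot contain
a proper solution over $\mathbb{Z}_q$ to any equation in $\Omega$. In fact, it
is enough to show that $R_{p,n}$ cannot contain a proper integer
solution to any equation in $\Omega$. This is because for any set
$\{u_1, u_2, \ldots, u_m, v\} \subset R_{p,n}$, writing $u_{j,t}$ for the $t$-th
digit of $u_j$, we must have

$$\sum_{j=1}^{m} c_{i,j} u_j = \sum_{j=1}^{m} c_{i,j} \Big( \sum_{t=0}^{k-1} u_{j,t} (nD+1)^t \Big) \leq \sum_{j=1}^{m} c_{i,j} \sum_{t=0}^{k-1} n (nD+1)^t \leq (nD) \sum_{t=0}^{k-1} (nD+1)^t = (nD+1)^k - 1,$$

and similarly,

$$b_i v \leq (nD+1)^k - 1,$$

which together imply

$$\Big| \sum_{j=1}^{m} c_{i,j} u_j - b_i v \Big| \leq \max\Big\{ \sum_{j=1}^{m} c_{i,j} u_j,\ b_i v \Big\} \leq (nD+1)^k - 1 < q.$$

Hence, $\sum_{j} c_{i,j} u_j - b_i v \equiv 0 \pmod{q}$ if and only if
$\sum_{j} c_{i,j} u_j - b_i v = 0$ (over the integers).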

Note that, since $c_{i,j} \leq b_i \leq D$, each digit $x_i$ in the
$(nD+1)$-ary representation of an element in $R_{p,n}$ is small enough so
that there is no carry-over when performing any of the sums in
$\Omega$. Hence, adding numbers in $R_{p,n}$ corresponds to adding their
coordinate vectors. Now, suppose that the $i$th equation in $\Omega$ has
a solution $\{u_1, u_2, \ldots, u_m, v\} \subset R_{p,n}$, i.e.,
$\sum_{j=1}^{m} c_{i,j} u_j = b_i v$, with
$N(u_1)^2 = N(u_2)^2 = \cdots = N(u_m)^2 = N(v)^2 = p$.
Then,

$$v = \sum_{j=1}^{m} \frac{c_{i,j}}{b_i}\, u_j, \qquad \sum_{j=1}^{m} \frac{c_{i,j}}{b_i} = 1,$$

which means that the coordinate vector of $v$ is a convex
combination of the coordinate vectors of the $m$ integers $u_j$,
$j = 1, \ldots, m$, and all these vectors lie on the $\ell^2$-sphere
of radius $\sqrt{p}$. However, by the strict convexity of the
$\ell^2$-norm, this can happen only if all these coordinate vectors are
identical, or equivalently, only if $u_1 = u_2 = \cdots = u_m = v$.
So, $R_{p,n}$ cannot contain a proper integer solution to any
equation in $\Omega$.

At this point, the proof turns nonconstructive. Note that the
union

$$\bigcup_{p \geq 1} R_{p,n} = \{x \in [1, M] : 0 \leq x_i \leq n \ \forall i\}$$

contains $(n+1)^k - 1$ points in all, since this is the number
of nonzero sequences $(x_0, \ldots, x_{k-1})$ such that $0 \leq x_i \leq n$ for
$i = 0, 1, \ldots, k-1$. Furthermore, for any $x \in \bigcup_{p \geq 1} R_{p,n}$, we have
$N(x)^2 = \sum_{i=0}^{k-1} x_i^2 \leq k n^2$, so that

$$\bigcup_{p \geq 1} R_{p,n} = \bigcup_{p=1}^{k n^2} R_{p,n}.$$

Thus, the union of the sets $R_{p,n}$, $p = 1, 2, \ldots, k n^2$, contains
a total of $(n+1)^k - 1$ points. Hence, by the pigeon-hole
principle, there exists a $p^* \in [1, k n^2]$ such that

$$|R_{p^*,n}| \geq \frac{(n+1)^k - 1}{k n^2} \geq \frac{(n+1)^{k-2}}{k}.$$

Finally, $S = R_{p^*,n} - 1 = \{u - 1 : u \in R_{p^*,n}\}$ is the set whose
existence is claimed in the statement of the lemma. Indeed,
$S \subset [0, M-1]$, and since $M = (nD+1)^k - 1 < q$, we have
$S \subset [0, q-1]$. Moreover, as $(1, 1, \ldots, 1)$ is a solution to every
equation in $\Omega$, and $R_{p^*,n}$ does not contain a proper solution to
any equation in $\Omega$, $S$ cannot contain such a solution either. ■
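
For small parameters, the set $S$ of Lemma 13 can be found by exhaustive search over the spheres $R_{p,n}$. The following Python sketch is illustrative only. The names `behrend_set` and `has_proper_solution` are hypothetical, a "proper" solution is taken to mean one whose entries are not all equal (an assumption, since the definition appears earlier in the paper), and an equation is passed as its coefficient tuple $(c_{i,1}, \ldots, c_{i,m})$ with $b_i = \sum_j c_{i,j}$.

```python
from itertools import product

def behrend_set(q, n, D):
    """Return (S, k): the largest sphere R_{p,n}, shifted down by 1,
    together with the number of digits k used in the construction."""
    if n < 1 or n * D >= q:
        raise ValueError("need n >= 1 and n*D < q")
    base = n * D + 1
    k = 0
    while base ** (k + 1) <= q:       # largest k with base**k <= q
        k += 1
    spheres = {}                      # squared norm p -> integers in R_{p,n}
    for digits in product(range(n + 1), repeat=k):
        x = sum(d * base ** i for i, d in enumerate(digits))
        if x == 0:
            continue                  # R_{p,n} only contains x in [1, M]
        p = sum(d * d for d in digits)
        spheres.setdefault(p, []).append(x)
    largest = max(spheres.values(), key=len)
    return sorted(x - 1 for x in largest), k

def has_proper_solution(S, q, coeffs):
    """Brute-force search for (u_1..u_m, v) in S, not all equal, with
    sum_j c_j*u_j - b*v congruent to 0 mod q, where b = sum(coeffs)."""
    b = sum(coeffs)
    for us in product(S, repeat=len(coeffs)):
        for v in S:
            if all(u == v for u in us):
                continue
            if (sum(c * u for c, u in zip(coeffs, us)) - b * v) % q == 0:
                return True
    return False
```

As a usage example, for the single equation $u_1 + u_2 = 2v$ (so $D = 2$) with $q = 125$ and $n = 2$, the sketch gives $k = 3$ and a set of six integers, which exceeds the guaranteed $(n+1)^{k-2}/k = 1$, and `has_proper_solution(S, 125, (1, 1))` returns `False`.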

We can now give the proof of Theorem 12.

Proof of Theorem 12: Given $q > D^2$, pick an arbitrary
$\epsilon > \log D / \log q$. Then, choosing $n = \lfloor \frac{1}{D} q^{\epsilon} \rfloor$ and applying
Lemma 13, we get the following lower bound on $s(q;\Omega)$:

$$s(q;\Omega) \geq \epsilon\, D^{2 - 1/\epsilon}\, q^{1 - 2\epsilon}\, (1 + o(1)), \qquad (22)$$

where $o(1)$ denotes a correction factor that goes to zero as
$q \to \infty$.

Now, it may be verified that the value of $\epsilon$ that
maximizes the function $f(\epsilon) = \epsilon\, D^{2 - 1/\epsilon}\, q^{1 - 2\epsilon}$ is

$$\epsilon = \frac{1 + \sqrt{1 + 8 (\log D)(\log q)}}{4 \log q} \approx \sqrt{\tfrac{1}{2} \log D / \log q}.$$

For $q > D^2$, $\sqrt{\tfrac{1}{2} \log D / \log q} > \log D / \log q$, so the bound
in (22) applies with $\epsilon = \sqrt{\tfrac{1}{2} \log D / \log q}$. Plugging this value
of $\epsilon$ into (22) and manipulating the resulting expression, we
obtain the bound of the theorem. ■
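
The maximizer quoted in the proof can be recovered by a short calculation. The following is a sketch, assuming natural logarithms and treating $f(\epsilon) = \epsilon\, D^{2 - 1/\epsilon}\, q^{1 - 2\epsilon}$ as a function of a real variable $\epsilon > 0$.

```latex
\begin{align*}
\log f(\epsilon) &= \log \epsilon + \Big(2 - \frac{1}{\epsilon}\Big) \log D + (1 - 2\epsilon) \log q, \\
\frac{d}{d\epsilon} \log f(\epsilon) &= \frac{1}{\epsilon} + \frac{\log D}{\epsilon^{2}} - 2 \log q = 0
  \;\Longleftrightarrow\; 2 (\log q)\, \epsilon^{2} - \epsilon - \log D = 0, \\
\epsilon &= \frac{1 + \sqrt{1 + 8 (\log D)(\log q)}}{4 \log q}
  \quad \text{(the positive root)}.
\end{align*}
```

Since the second derivative of $\log f$, namely $-1/\epsilon^{2} - 2 (\log D)/\epsilon^{3}$, is negative for $\epsilon > 0$ and $D > 1$, this stationary point is the unique maximum. For large $q$ the term $\sqrt{8 (\log D)(\log q)}$ dominates the numerator, which gives the approximation $\sqrt{\tfrac{1}{2} \log D / \log q}$.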